Preface


A few decades ago, when the science of 
cognition was in its infancy, the early text- 
books on cognition began with perception 
and attention and ended with memory. So- 
called higher-level cognition — the mysteri- 
ous, complicated realm of thinking and rea- 
soning — was simply left out. Things have 
changed — any good cognitive text (and there 
are many) devotes several chapters to topics 
such as categorization, inductive and deduc- 
tive reasoning, judgment and decision mak- 
ing, and problem solving. What has still been 
missing, however, is a true handbook for 
the field of thinking and reasoning — a book 
meant to be kept close “at hand” by those in- 
volved in the field. Such a book would bring 
together top researchers to write chapters, 
each of which summarizes the basic con- 
cepts and findings for a major topic, sketches 
its history, and provides a sense of the di- 
rections in which research is currently head- 
ing. This handbook would provide quick 
overviews for experts in each topic area, and 
more importantly for experts in allied topic 
areas (because few researchers can keep up 
with the scientific literature over the full
breadth of the field of thinking and rea-
soning). Even more crucially, this handbook
would provide an entry point into the field 
for the next generation of researchers by pro- 
viding a text for use in classes on thinking and 
reasoning designed for graduate students and 
upper-level undergraduates. 

The Cambridge Handbook of Thinking and 
Reasoning is intended to be this previously 
missing handbook. The project was first con- 
ceived at the meeting of the Cognitive Sci- 
ence Society in Edinburgh, Scotland, dur- 
ing the summer of 2001. The contents of 
the volume are sketched in Chapter 1. Our 
aim is to provide comprehensive and au- 
thoritative reviews of all the core topics of 
the field of thinking and reasoning, with 
many pointers for further reading. Undoubt- 
edly, there are still omissions, but we have 
included as much as we could realistically 
fit in a single volume. Our focus is on re- 
search from cognitive psychology, cognitive 
science, and cognitive neuroscience, but we 
also include work related to developmen- 
tal, social, and clinical psychology; philos- 
ophy; economics; artificial intelligence; lin- 
guistics; education; law; and medicine. We 
hope that scholars and students in all these 
fields and others will find this to be a valuable
collection.

We have many to thank for their help 
in bringing this endeavor to fruition. Philip 
Laughlin, our editor at Cambridge Univer- 
sity Press, gave us exactly the balance of 
encouragement and patience we needed. It 
is fitting that a handbook of thinking and 
reasoning should bear the imprint and in- 
deed the name of this illustrious press, with 
its long history reaching back to the ori- 
gins of scientific inquiry. Michie Shaw, Se- 
nior Project Manager at TechBooks, pro- 
vided us with close support throughout the 
arduous editing process. At UCLA, Chris- 
tine Vu did a great deal of organizational 
work in her role as our editorial assistant 
for the entire project. During this period, 
our own efforts were supported by grants 
R305H030141 from the Institute of Educa-
tion Sciences and SES-0080375 from the
National Science Foundation to KJH, and 
from Xunesis and National Service Research 
Award MH-064244 from the National Insti- 
tute of Mental Health to RGM. 

Then there are the authors. (It would 
seem a bit presumptuous to call them “our” 
authors!) People working on tough intellec-
tual problems sometimes experience a mo-
ment of insight: a sense that although many
laborious steps may lie ahead, the basic ele-
ments of a solution are already in place. Such
fortunate people work on happily, confident 
that ultimate success is assured. In preparing 
this handbook, we also had our moment of 
“insight.” It came when all these outstanding 
researchers agreed to join our project. Be- 
fore the first chapter was drafted, we knew 
the volume was going to be of the highest 
quality. Along the way, our distinguished au- 
thors graciously served as each other’s crit- 
ics as we passed drafts around, working to 
make the chapters as integrated as possible, 
adding in pointers from one to another. Then 
the authors all changed hats again and went 
back to work revising their own chapters in 
light of the feedback their peers had pro- 
vided. We thank you all for making our own 
small labors a great pleasure. 


KEITH J. HOLYOAK 
University of California, Los Angeles 


ROBERT G. MORRISON 
Xunesis, Chicago 
October 2004 
CHAPTER 1


Thinking and Reasoning: 
A Reader’s Guide 


Keith J. Holyoak 
Robert G. Morrison 


“Cogito, ergo sum,” the French philosopher 
René Descartes famously declared, “I think, 
therefore I am.” Every normal human adult 
shares a sense that the ability to think, to rea- 
son, is a part of their fundamental identity. 
A person may be struck blind or deaf, yet 
we still recognize his or her core cognitive 
capacities as intact. Even loss of language, 
the gift often claimed as the sine qua non 
of Homo sapiens, does not take away a per-
son’s essential humanness. Unlike language
ability, which is essentially unique to our 
species, the rudimentary ability to think and 
reason is apparent in nonhuman primates 
(see Call & Tomasello, Chap. 25); and yet it 
is thinking, not language, that lies closest to 
the core of our individual identity. A person 
who loses language but can still make intel- 
ligent decisions, as demonstrated by actions, 
is viewed as mentally intact. In contrast, the 
kinds of brain damage that rob an individ- 
ual of the capacity to think and reason are 
considered the harshest blows that can be 
struck against a sense of personhood. Cogito, 
ergo sum. 


What Is Thinking? 


We can start to answer this question by look- 
ing at the various ways the word “think- 
ing” is used in everyday language. “I think 
that water is necessary for life” and “George 
thinks the Pope is a communist” both ex- 
press beliefs (of varying degrees of appar- 
ent plausibility), that is, explicit claims of 
what someone takes to be a truth about the 
world. “Anne is sure to think of a solution” 
carries us into the realm of problem solv- 
ing, the mental construction of an action 
plan to achieve a goal. The complaint “Why 
didn’t you think before you went ahead with 
your half-baked scheme?” emphasizes that 
thinking can be a kind of foresight, a way 
of “seeing” the possible future.1 “What do
you think about it?” calls for a judgment, 
an assessment of the desirability of an op- 
tion. Then there’s “Albert is lost in thought,” 
where thinking becomes some sort of mental 
meadow through which a person might me- 
ander on a rainy afternoon, oblivious to the 
world outside. 
Rips and Conrad (1989) elicited judg-
ments from college students about how var-
ious mentalistic terms relate to one another.
Using statistical techniques, the investigators 
were able to summarize these relationships 
in two diagrams, shown in Figure 1.1. Fig- 
ure 1.1(A) is a hierarchy of kinds, or cat- 
egories. Roughly, people believe planning 
is a kind of deciding, which is a kind of 
reasoning, which is a kind of conceptual- 
izing, which is a kind of thinking. People 
also believe that thinking is part of con- 
ceptualizing, which is part of remembering, 
which is part of reasoning, and so on [Fig- 
ure 1.1(B)]. The kinds ordering and the parts 
ordering are similar; most strikingly, “think- 
ing” is the most general term in both order- 
ings — the grand superordinate of mental ac- 
tivities, which permeates all the others. 

It is not easy to make the move from the 
free flow of everyday speech to scientific def- 
initions of mental terms, but let us nonethe- 
less offer a preliminary definition of thinking 
to suggest what this book is about: Thinking 
is the systematic transformation of mental rep- 
resentations of knowledge to characterize ac- 
tual or possible states of the world, often in 
service of goals. Obviously, our definition in- 
troduces a plethora of terms with meanings 
that beg to be unpacked, but at which we can 
only hint. A mental representation of knowl- 
edge is an internal description that can be 
manipulated to form other descriptions. To 
count as thinking, the manipulations must 
be systematic transformations governed by 
certain constraints. Whether a logical deduc- 
tion or a creative leap, what we mean by 
thinking is more than unconstrained associ- 
ations (with the caveat that thinking may in- 
deed be disordered; see Bachman & Cannon, 
Chap. 21). The internal representations cre- 
ated by thinking describe states of some ex- 
ternal world (a world that may include the 
thinker as an object of self-reflection) — that 
world might be our everyday one, or perhaps
some imaginary construction obeying
the “laws” of magical realism. Often (not 
always — the daydreamer, and indeed the 
night dreamer, are also thinkers), thinking 
is directed toward achieving some desired
state of affairs, some goal that motivates the
thinker to perform mental work.

Our definition thus includes quite a few 
stipulations, but notice also what is left out. 
We do not claim that thinking necessarily 
requires a human (higher-order primates, 
and perhaps some other species on this or 
other planets, have a claim to be considered 
thinkers) (see Call & Tomasello, Chap. 25) 
or even a sentient being. (The field of ar- 
tificial intelligence may have been a disap- 
pointment in its first half-century, but we 
are reluctant to define it away as an oxy- 
moron.) Nonetheless, our focus in this book 
is on thinking by hominids with electro- 
chemically powered brains. Thinking often 
seems to be a conscious activity of which 
the thinker is aware (cogito, ergo sum); how- 
ever, consciousness is a thorny philosophi- 
cal puzzle, and some mental activities seem 
pretty much like thinking, except for being 
implicit rather than explicit (see Litman & 
Reber, Chap. 18). Finally, we do not claim 
that thinking is inherently rational, optimal, 
desirable, or even smart. A thorough history 
of human thinking will include quite a few 
chapters on stupidity. 

The study of thinking includes several in- 
terrelated subfields that reflect slightly dif- 
ferent perspectives on thinking. Reasoning, 
which has a long tradition that springs from 
philosophy and logic, places emphasis on the 
process of drawing inferences (conclusions) 
from some initial information (premises). In 
standard logic, an inference is deductive if the 
truth of the premises guarantees the truth 
of the conclusion by virtue of the argument 
form. If the truth of the premises renders the 
truth of the conclusion more credible but 
does not bestow certainty, the inference is 
called inductive.2 Judgment and decision mak-
ing involve assessment of the value of an
option or the probability that it will yield 
a certain payoff (judgment) coupled with 
choice among alternatives (decision mak- 
ing). Problem solving involves the construc- 
tion of a course of action that can achieve a 
goal. 

Although these distinct perspectives on 
thinking are useful in organizing the field 
Figure 1.1. People’s conceptions of the relationships among terms for mental activities. A, Ordering
of “kinds.” B, Ordering of “parts.” (Adapted from Rips & Conrad, 1989, with permission.) 


(and this volume), these aspects of thinking 
overlap in every conceivable way. To solve 
a problem, one is likely to reason about the 
consequences of possible actions and make 
decisions to select among alternative actions. 
A logic problem, as the name implies, is a 
problem to be solved (with the goal of de- 
riving or evaluating a possible conclusion). 
Making a decision is often a problem that 
requires reasoning. These subdivisions of the 
field, like our preliminary definition of think- 
ing, should be treated as guideposts, not 
destinations. 


A Capsule History 


Thinking and reasoning, long the academic 
province of philosophy, have over the past 
century emerged as core topics of empirical 
investigation and theoretical analysis in the 
modern fields known as cognitive psychol-
ogy, cognitive science, and cognitive neuro-
science. Before psychology was founded, the
eighteenth-century philosophers Immanuel
Kant (in Germany) and David Hume (in 
Scotland) laid the foundations for all subse- 
quent work on the origins of causal knowl- 
edge, perhaps the most central problem in 
the study of thinking (see Buehner & Cheng, 
Chap. 7). If we were to choose one phrase 
to set the stage for modern views of think- 
ing, it would be an observation of the British 
philosopher Thomas Hobbes, who, in 1651, 
in his treatise Leviathan, proposed, “Rea- 
soning is but reckoning.” “Reckoning” is an 
odd term today, but in the seventeenth cen- 
tury it meant computation, as in arithmetic 
calculations.3 

It was not until the twentieth century that 
the psychology of thinking became a scien- 
tific endeavor. The first half of the century 
gave rise to many important pioneers who 
in very different ways laid the foundations 
for the emergence of the modern field of 
thinking and reasoning. Foremost were the 
Gestalt psychologists of Germany, who pro- 
vided deep insights into the nature of prob- 
lem solving (see Novick & Bassok, Chap. 14). 
Most notable were Karl Duncker and Max
Wertheimer, students
of human problem solving, and Wolfgang
Köhler, a keen observer of problem solv-
ing by great apes (see Call & Tomasello,
Chap. 25). 

The pioneers of the early twentieth cen- 
tury also include Sigmund Freud, whose 
complex and ever-controversial legacy in- 
cludes the notions that forms of thought 
can be unconscious (see Litman & Reber, 
Chap. 18) and that “cold” cognition is tan- 
gled up with “hot” emotion (see Molden & 
Higgins, Chap. 13). As the founder of clini- 
cal psychology, Freud’s legacy also includes 
the ongoing integration of research on nor- 
mal thinking with studies of thought disor- 
ders, such as schizophrenia (see Bachman & 
Cannon, Chap. 21). 

Other early pioneers in the early and 
mid-twentieth century contributed to vari- 
ous fields of study that are now embraced 
within thinking and reasoning. Cognitive de- 
velopment continues to be influenced by the 
early theories developed by the Swiss psy- 
chologist Jean Piaget (see Halford, Chap. 22) 
and the Russian psychologist Lev Vygotsky 
(see Greenfield, Chap. 27). In England,
Charles Spearman was a leader in the
systematic study of individual differences in 
intelligence (see Sternberg, Chap. 31). In the 
middle of the century, the Russian neurolo- 
gist Alexander Luria made immense contri- 
butions to our understanding of how think- 
ing depends on specific areas of the brain, 
anticipating the modern field of cognitive 
neuroscience (see Goel, Chap. 20). Around 
the same time, in the United States, Herbert 
Simon argued that the traditional rational 
model of economic theory should be re- 
placed with a framework that accounted for 
a variety of human resource constraints such 
as bounded attention and memory capac- 
ity and limited time (see LeBoeuf & Shafir, 
Chap. 11, and Morrison, Chap. 19). This was 
one of the contributions that in 1978 earned 
Simon the Nobel Prize in Economics. 

In 1943, the British psychologist Kenneth 
Craik sketched the fundamental notion that 
a mental representation provides a kind of 
model of the world that can be “run” to make
predictions (much as an engineer might
use a physical scale model of a bridge to
anticipate the effects of stress on the ac- 
tual bridge intended to span a river).4 In the 
1960s and 1970s, modern work on the psy- 
chology of reasoning began in Britain with 
the contributions of Peter Wason and his col- 
laborator Philip Johnson-Laird (see Evans, 
Chap. 8). 

The modern conception of thinking as 
computation became prominent in the
1970s. In their classic treatment of human
problem solving, Allen Newell and Herbert 
Simon (1972) showed that the computa- 
tional analysis of thinking (anticipated by 
Alan Turing, the father of computer science) 
could yield important empirical and theo- 
retical results. Like a program running on a 
digital computer, a person thinking through 
a problem can be viewed as taking an in- 
put that represents initial conditions and a 
goal, and applying a sequence of operations 
to reduce the difference between the initial 
conditions and the goal. The work of Newell 
and Simon established computer simulation 
as a standard method for analyzing human 
thinking. Their work also highlighted the po- 
tential of production systems (see Novick & 
Bassok, Chap. 14), which were subsequently 
developed extensively as cognitive models 
by John Anderson and his colleagues (see 
Lovett & Anderson, Chap. 17). 
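
The idea of reducing the difference between an initial state and a goal can be made concrete with a toy sketch. The code below is only an illustration under simplifying assumptions, not Newell and Simon's own program: states are integers, the "difference" is the absolute distance to the goal, and the function name `reduce_difference` and the three operators are hypothetical choices made for this example.

```python
def reduce_difference(start, goal, operators, max_steps=100):
    """Greedy difference reduction: repeatedly apply whichever operator
    leaves the state closest to the goal. Returns the list of operator
    names applied, or None if no operator makes progress."""
    state, plan = start, []
    for _ in range(max_steps):
        if state == goal:
            return plan
        # Choose the operator whose result has the smallest remaining difference.
        name, op = min(operators.items(),
                       key=lambda item: abs(goal - item[1](state)))
        next_state = op(state)
        if abs(goal - next_state) >= abs(goal - state):
            return None  # stuck: nothing reduces the difference
        state = next_state
        plan.append(name)
    return plan if state == goal else None


# Hypothetical operators for the toy problem of turning 3 into 25.
operators = {
    "add 10": lambda s: s + 10,
    "add 1": lambda s: s + 1,
    "subtract 1": lambda s: s - 1,
}
plan = reduce_difference(3, 25, operators)
print(plan)  # ['add 10', 'add 10', 'add 1', 'add 1']
```

A purely greedy strategy of this kind fails whenever progress requires a temporary step away from the goal, which is one reason richer control structures such as subgoaling were needed in actual models of problem solving.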

The 1970s saw a wide range of major de- 
velopments that continue to shape the field. 
Eleanor Rosch, building on earlier work by 
Jerome Bruner (Bruner, Goodnow, & Austin, 
1956), addressed the fundamental question 
of why people have the categories they do, 
and not other logically possible groupings of 
objects (see Medin & Rips, Chap. 3). Rosch 
argued that natural categories often have 
fuzzy boundaries (a whale is an odd mam- 
mal) but nonetheless have clear central ten- 
dencies or prototypes (people by and large 
agree that a bear makes a fine mammal). 
The psychology of human judgment was re- 
shaped by the insights of Amos Tversky and 
Daniel Kahneman, who identified simple 
cognitive strategies, or heuristics, that people 
use to make judgments of frequency and 
probability. Often quick and accurate, these
strategies can in some circumstances lead
to nonnormative judgments. After Tversky’s
death in 1996, this line of work was con- 
tinued by Kahneman, who was awarded the 
Nobel Prize in Economics in 2002. The current
view of judgment, which has emerged
from 30 years of research, is summarized by 
Kahneman and Frederick (Chap. 12; also see 
LeBoeuf & Shafir, Chap. 11). (Goldstone and 
Son, Chap. 2, review Tversky’s influential 
theory of similarity judgments.) 

In 1982, a young vision scientist, David 
Marr, published a book called Vision. Largely 
a technical treatment of visual perception, 
the book includes an opening chapter that 
lays out a larger vision — a vision of how 
the science of mind should proceed. Marr 
distinguished three levels of analysis, which 
he termed the level of computation, the level 
of representation and algorithm, and the level 
of implementation. Each level, according to 
Marr, addresses different questions, which 
he illustrated with the example of a phys- 
ical device, the cash register. At Marr’s most 
abstract level, computation (not to be con- 
fused with computation of an algorithm on a 
computer), the basic questions are “What is 
the goal that the cognitive process is meant 
to accomplish?” and “What is the logic of the 
mapping from the input to the output 
that distinguishes this mapping from other 
input-output mappings?” A cash register, 
viewed at this level, is used to achieve the 
goal of calculating how much is owed for a 
purchase. This task maps precisely onto the 
axioms of addition (e.g., the amount owed 
should not vary with the order in which 
items are presented to the sales clerk, a 
constraint that precisely matches the com- 
mutativity property of addition). It follows 
that, without knowing anything else about 
the workings of a particular cash register, 
we can be sure (if it is working prop- 
erly) that it will be performing addition 
(not division). 

The level of representation and algo- 
rithm, as the name implies, deals with the 
questions, “What is the representation of 
the input and output?” and “What is the 
algorithm for transforming the former into 
the latter?” Within a cash register, addition
might be performed using numbers in ei-
ther decimal or binary code, starting with
either the leftmost or rightmost digit. Fi- 
nally, the level of implementation addresses 
the question, “How are the representation 
and algorithm realized physically?” The cash 
register could be implemented as an elec- 
tronic calculator, a mechanical adding ma- 
chine, or even a mental abacus in the mind of 
the clerk. 
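
Marr's separation of levels can be illustrated with a short sketch. The two functions below are hypothetical examples written for this purpose: they differ at the level of representation and algorithm (decimal digits processed from the rightmost digit versus bitwise operations on binary code) yet carry out the same computation, addition, so the total owed does not depend on the order in which items are presented. Prices are assumed to be non-negative whole numbers of cents.

```python
from itertools import permutations


def add_decimal_right_to_left(a, b):
    """Add two non-negative integers digit by digit in base 10,
    starting with the rightmost digit and carrying as needed."""
    xs = [int(ch) for ch in reversed(str(a))]
    ys = [int(ch) for ch in reversed(str(b))]
    digits, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        x = xs[i] if i < len(xs) else 0
        y = ys[i] if i < len(ys) else 0
        carry, digit = divmod(x + y + carry, 10)
        digits.append(digit)
    if carry:
        digits.append(carry)
    return int("".join(str(d) for d in reversed(digits)))


def add_binary(a, b):
    """Add two non-negative integers using only bitwise operations."""
    while b:
        carry = (a & b) << 1  # positions where both bits are 1 produce a carry
        a = a ^ b             # sum of the bits, ignoring carries
        b = carry
    return a


def total(prices, add):
    """Sum a sequence of prices with the supplied addition routine."""
    result = 0
    for price in prices:
        result = add(result, price)
    return result


prices_in_cents = [250, 1999, 75]
# Every presentation order and both algorithms should agree on the total.
totals = {total(order, add)
          for order in permutations(prices_in_cents)
          for add in (add_decimal_right_to_left, add_binary)}
print(totals)  # {2324}
```

The single value in `totals` reflects the computational-level constraint (commutativity of addition); the choice between the two routines belongs to the level of representation and algorithm, and the hardware that runs either one belongs to the level of implementation.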

In his book, Marr stressed the importance 
of the computational level of analysis, ar- 
guing that it could be seriously misleading 
to focus prematurely on the more concrete 
levels of analysis for a cognitive task with- 
out understanding the goal or nature of the 
mental computation.5 Sadly, Marr died of
leukemia before Vision was published, and 
so we do not know how his thinking about 
levels of analysis might have evolved. In 
very different ways, Marr’s conception of a 
computational level of analysis is reflected 
in several chapters in this book (see espe- 
cially Doumas & Hummel, Chap. 4; Buehner 
& Cheng, Chap. 7; Lovett & Anderson, 
Chap. 17). 

In the most recent quarter-century, many 
other springs of research have fed into the 
river of thinking and reasoning, including 
the field of analogy (see Holyoak, Chap. 6), 
neural network models (see Doumas & 
Hummel, Chap. 4; Halford, Chap. 22), and 
cognitive neuroscience (see Goel, Chap. 20). 
The chapters of this handbook collectively 
paint a picture of the state of the field at the 
dawn of the new millennium. 


Overview of the Handbook 


This volume brings together the contribu- 
tions of many of the leading researchers 
in thinking and reasoning to create the 
most comprehensive overview of research 
on thinking and reasoning that has ever been 
available. Each chapter includes a bit of his- 
torical perspective on the topic and ends 
with some thoughts about where the field 
seems to be heading. The book is organized 
into seven sections. 


6 THE CAMBRIDGE HANDBOOK OF THINKING AND REASONING 


Part I: The Nature of Human Concepts


The three chapters in Part I address foun- 
dational issues related to the representation 
of human concepts. Chapter 2 by Gold- 
stone and Son reviews work on the core 
concept of similarity — how people assess 
the degree to which objects or events are 
alike. Chapter 3 by Medin and Rips consid- 
ers research on categories and how concepts 
are organized in semantic memory. Think- 
ing depends not only on representations of 
individual concepts, such as dogs and cats, 
but also on representations of the relation- 
ships among concepts, such as the fact that 
dogs often chase cats. In Chapter 4, Doumas 
and Hummel evaluate different compu- 
tational approaches to the representation 
of relations. 


Part II: Reasoning 


Chapters 5 to 10 deal with varieties of 
the core topic of reasoning. In Chapter 5, 
Sloman and Lagnado set the stage by lay- 
ing out the issues surrounding induction — 
using what is known to generate plausi- 
ble, although uncertain, inferences. Then, 
in Chapter 6, Holyoak reviews the liter- 
ature on reasoning by analogy, an impor- 
tant variety of inductive reasoning that is 
critical for learning. The most classic as- 
pect of induction is the way in which hu- 
mans and other creatures acquire knowledge 
about causal relations, which is critical for 
predicting the consequences of actions and 
events. In Chapter 7, Buehner and Cheng 
discuss research and theory on causal learn- 
ing. Then, in Chapter 8, Evans reviews work 
on the psychology of deductive reasoning, 
the form of thinking with the closest ties 
to logic. In Chapter 9, Johnson-Laird de- 
scribes the work that he and others have 
performed using the framework of men- 
tal models to deal with various reasoning 
tasks, both deductive and inductive. Men-
tal models have close connections to percep-
tual representations that are visuospatial. In
Chapter 10, Barbara Tversky reviews work
on the role of visuospatial representations 
in thinking. 


Part III: Judgment and Decision Making


We then turn to topics related to judgment
and decision making. In Chapter 11, LeBoeuf 
and Shafir set the stage with a general re- 
view of work on decision making. Then, 
in Chapter 12, Kahneman and Frederick 
present an overarching model of heuristic 
judgment. In Chapter 13, Molden and Hig- 
gins review research revealing the ways in 
which human motivation and emotion influ- 
ence judgment. 


Part IV: Problem Solving 
and Complex Learning 


The five chapters that comprise this section 
deal with problem solving and allied issues 
concerning how people learn in problem- 
solving situations. In Chapter 14, Novick 
and Bassok provide a general overview of 
the field of human problem solving. Prob- 
lem solving has close connections to the 
topic of creativity, the focus of Chapter 15 
by Sternberg, Lubart, Kaufman, and Pretz. 
Beyond relatively routine problem solving, 
there are occasions when people need to re- 
structure their knowledge in complex ways 
to generate deeper understanding. How such 
complex learning takes place is the topic of 
Chapter 16 by Chi and Ohlsson. In Chap- 
ter 17, Lovett and Anderson review work 
on thinking that is based on a particular 
formal approach rooted in work on prob- 
lem solving, namely, production systems. 
Finally, in Chapter 18, Litman and Reber 
consider research suggesting that some as- 
pects of thinking and learning depend on im- 
plicit mechanisms that operate largely out- 
side of awareness. 


Part V: Cognitive and Neural Constraints 
on Human Thought 


High-level human thinking cannot be fully 
understood in isolation from fundamental 
cognitive processes and their neural sub- 
strates. In Chapter 19, Morrison reviews the 
wealth of evidence indicating that thinking 
and reasoning depend critically on what is 
known as “working memory,” that is, the sys-
tem responsible for short-term maintenance
and manipulation of information. Current
work is making headway in linking thought
processes to specific brain structures such as 
the prefrontal cortex; in Chapter 20, Goel 
discusses the key topic of deductive reason- 
ing in relation to its neural substrate. Brain 
disorders, notably schizophrenia, produce 
striking disruptions of normal thought pro- 
cesses, which can shed light on how thinking 
takes place in normal brains. In Chapter 21, 
Bachman and Cannon review research and 
theory concerning thought disorder. 


Part VI: Ontogeny, Phylogeny, Language, 
and Culture 


Our understanding of thinking and reason- 
ing would be gravely limited if we restricted 
investigation to young adult English speak- 
ers. The six chapters in Part VI deal with the 
multifaceted ways in which aspects of think- 
ing vary across the human lifespan, across 
species, across speakers of different lan- 
guages, and across cultures. In Chapter 22, 
Halford provides an overview of the devel- 
opment of thinking and reasoning over the 
course of childhood. In Chapter 23, Gallistel 
and Gelman discuss mathematical thinking, 
a special form of thinking found in rudi- 
mentary form in nonhuman animals that un- 
dergoes development in children. In Chap- 
ter 24, Salthouse describes the changes in 
thinking and reasoning brought on by the 
aging process. The phylogeny of thinking — 
thinking and reasoning as performed by apes 
and monkeys — is discussed in Chapter 25 by 
Call and Tomasello. One of the most contro- 
versial topics in the field is the relationship 
between thinking and the language spoken 
by the thinker; in Chapter 26, Gleitman 
and Papafragou review the hypotheses and 
evidence concerning the connections be- 
tween language and thought. In Chapter 27, 
Greenfield considers the ways in which 
modes of thinking may vary in the context 
of different human cultures. 


Part VII: Thinking in Practice 


In cultures ancient and modern, thinking 
is put to particular use in special cultural 
practices. Moreover, there are individual dif-
ferences in the nature and quality of hu-
man thinking. This section includes three
chapters focusing on thinking in particu- 
lar practices and two chapters that deal 
with variations in thinking ability. In Chap- 
ter 28, Ellsworth reviews what is known 
about thinking in the field of law. In Chap- 
ter 29, Dunbar and Fugelsang discuss think- 
ing and reasoning as manifested in the prac- 
tice of science. In Chapter 30, Patel, Arocha, 
and Zhang discuss reasoning in a field — 
medicine — in which accurate diagnosis and 
treatment are literally everyday matters of 
life and death. Then, in Chapter 31, Stern- 
berg reviews work on the concept of intel- 
ligence as a source of individual differences 
in thinking and reasoning. Finally, Chapter 
32 by Ritchhart and Perkins concludes the 
volume by reviewing one of the major chal- 
lenges for education — finding ways to teach 
people to think more effectively. 


Examples of Chapter Assignments 
for a Variety of Courses 


This volume offers a comprehensive treat- 
ment of higher cognition. As such, it serves 
as an excellent source for courses on think- 
ing and reasoning, both at the graduate 
level and for upper-level undergraduates. Al- 
though instructors for semester-length grad- 
uate courses in thinking and reasoning may 
opt to assign the entire volume as a text- 
book, there are a number of other possibili- 
ties (including using chapters from this vol- 
ume as introductions for various topics and 
then supplementing with readings from the 
primary literature). Here are a few examples 
of possible chapter groupings tailored to a 
variety of possible course offerings: 


Introduction to Thinking and Reasoning 


Chapter 1 Thinking and Reasoning: A Reader’s Guide
Chapter 2 Similarity
Chapter 3 Concepts and Categories: Memory, Meaning, and Metaphysics
Chapter 5 The Problem of Induction
Chapter 6 Analogy
Chapter 7 Causal Learning
Chapter 8 Deductive Reasoning
Chapter 9 Mental Models and Thought
Chapter 10 Visuospatial Reasoning
Chapter 11 Decision Making
Chapter 12 A Model of Heuristic Judgment
Chapter 14 Problem Solving
Chapter 15 Creativity
Chapter 16 Complex Declarative Learning
Chapter 18 Implicit Cognition and Thought


Development of Thinking 


Chapter 2 Similarity
Chapter 3 Concepts and Categories: Memory, Meaning, and Metaphysics
Chapter 22 Development of Thinking
Chapter 23 Mathematical Thinking
Chapter 26 Language and Thought
Chapter 24 Effects of Aging on Reasoning
Chapter 25 Reasoning and Thinking in Nonhuman Primates
Chapter 19 Thinking in Working Memory
Chapter 31 Intelligence
Chapter 32 Learning to Think: The Challenges of Teaching Thinking


Modeling Human Thought


Chapter 2 Similarity
Chapter 3 Concepts and Categories: Memory, Meaning, and Metaphysics
Chapter 4 Approaches to Modeling Human Mental Representations: What Works, What Doesn't, and Why
Chapter 6 Analogy
Chapter 7 Causal Learning
Chapter 9 Mental Models and Thought
Chapter 22 Development of Thinking
Chapter 17 Thinking as a Production System



Applied Thought


Chapter 14 Problem Solving
Chapter 10 Visuospatial Reasoning
Chapter 23 Mathematical Thinking
Chapter 26 Language and Thought
Chapter 15 Creativity
Chapter 31 Intelligence
Chapter 13 Motivated Thinking
Chapter 27 Paradigms of Cultural Thought
Chapter 16 Complex Declarative Learning
Chapter 18 Implicit Cognition and Thought
Chapter 28 Legal Reasoning
Chapter 29 Scientific Thinking and Reasoning
Chapter 30 Reasoning in Medicine


Differences in Thought 


Chapter 31 Intelligence
Chapter 15 Creativity
Chapter 19 Thinking in Working Memory
Chapter 21 Cognitive and Neuroscience Aspects of Thought Disorder
Chapter 22 Development of Thinking
Chapter 25 Reasoning and Thinking in Nonhuman Primates
Chapter 24 Effects of Aging on Reasoning
Chapter 26 Language and Thought
Chapter 13 Motivated Thinking
Chapter 27 Paradigms of Cultural Thought
Chapter 29 Scientific Thinking and Reasoning
Chapter 32 Learning to Think: The Challenges of Teaching Thinking

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Notes 


1. Notice the linguistic connection between
“thinking” and “seeing,” and thought and perception,
which was emphasized by the Gestalt
psychologists of the early twentieth century.

2. The distinction between deduction and in- 
duction blurs in the study of the psychol- 
ogy of thinking, as we see in Part II of 
this volume. 

3. There are echoes of the old meaning of 
“reckon” in such phrases as “reckon the cost.” 
As a further aside, the term “dead reckon- 
ing,” a procedure for calculating the position 
of a ship or aircraft, derives from “deduc- 


hero in a tough spot might venture, “I reckon 
we can hold out till sun-up,” illustrating how 
calculation has crossed over to become a 
metaphor for mental judgment. 

4. See Johnson-Laird, Chap. 9, for a current view 
of thinking and reasoning that owes much to 
Craik’s seminal ideas. 

5. Indeed, Marr criticized Newell and Simon’s 
approach to problem solving for paying insuf- 
ficient attention to the computational level in 
his sense.


CHAPTER 2


Similarity 


Robert L. Goldstone 
Ji Yun Son 


Introduction 


Human assessments of similarity are funda- 
mental to cognition because similarities in 
the world are revealing. The world is an orderly
enough place that similar objects and
events tend to behave similarly. This fact 
of the world is not just a fortunate coinci- 
dence. It is because objects are similar that 
they will tend to behave similarly in most 
respects. It is because crocodiles and alliga- 
tors are similar in their external form, in- 
ternal biology, behavior, diet, and customary 
environment that one can often successfully 
generalize from what one knows of one to 
the other. As Quine (1969) observed, “Similarity
is fundamental for learning, knowledge
and thought, for only our sense of similarity
allows us to order things into kinds
so that these can function as stimulus mean- 
ings. Reasonable expectation depends on the 
similarity of circumstances and on our ten- 
dency to expect that similar causes will have 
similar effects” (p. 114). Similarity thus plays 
a crucial role in making predictions because 
similar things usually behave similarly. 


From this perspective, psychological as- 
sessments of similarity are valuable to the 
extent that they provide grounds for predict- 
ing as many important aspects of our world 
as possible (Holland, Holyoak, Nisbett, 
& Thagard, 1986; see Dunbar & Fugelsang, 
Chap. 29). Appreciating the similarity be- 
tween crocodiles and alligators is helpful 
because information learned about one is 
generally true of the other. If we learned an
arbitrary fact about crocodiles, such as that they
are very sensitive to the cold, then we could
probably infer that this fact is also true of 
alligators. As the similarity between A and 
B increases, so does the probability of cor- 
rectly inferring that B has X upon knowing 
that A has X (Tenenbaum, 1999). This re- 
lation assumes we have no special knowl- 
edge related to property X. Empirically, Heit 
and Rubinstein (1994) showed that if we do 
know about the property, then this knowl- 
edge, rather than a one-size-fits-all similarity, 
is used to guide our inferences. For example, 
if people are asked to make an inference 
about an anatomical property, then anatomical
similarities have more influence than
behavioral similarities. Boars are anatomically
but not behaviorally similar to pigs,
and this difference successfully predicts that 
people are likely to make anatomical but 
not behavioral inferences from pigs to boars. 
The logical extreme of this line of reason- 
ing (Goodman, 1972; Quine, 1977) is that if 
one has complete knowledge about the rea- 
sons why an object has a property, then gen- 
eral similarity is no longer relevant to gener- 
alizations. The knowledge itself completely 
guides whether the generalization is appro- 
priate. Moonbeams and melons are not very 
similar generally speaking, but if one is told 
that moonbeams have the property that the 
word begins with Melanie’s favorite letter, 
then one can generalize this property to mel- 
ons with very high confidence. 

By contrasting the cases of crocodiles,
boars, and moonbeams, we can specify the
benefits and limitations of similarity. We
tend to rely on similarity to generate inferences
and categorize objects into kinds
when we do not know exactly what properties
are relevant or when we cannot easily
separate an object into separate properties.
Similarity is an excellent example of a
domain-general source of information. Even 
when we do not have specific knowledge 
of a domain, we can use similarity as a de- 
fault method to reason about it. The contra- 
vening limitation of this domain generality 
is that when specific knowledge is available, 
then a generic assessment of similarity is 
no longer as relevant (Keil, 1989; Murphy, 
2002; Murphy & Medin, 1985; Rips, 1989; 
Rips & Collins, 1993). Artificial laboratory 
experiments in which subjects are asked to 
categorize unfamiliar stimuli into novel categories
invented by the experimenter are situations
in which similarity is clearly important
because subjects have little else to use
(Estes, 1994; Nosofsky, 1984, 1986). How- 
ever, similarity is also important in many 
real world situations because our knowledge 
does not run as deep as we think it does 
(Rozenblit & Keil, 2002) and because a gen- 
eral sense of similarity often has an influence 
even when more specific knowledge ought 
to overrule it (Allen & Brooks, 1991; Smith & 
Sloman, 1994). 


Another argument for the importance of
similarity in cognition is simply that it plays
a significant role in psychological accounts 
of problem solving, memory, prediction, 
and categorization. If a problem is similar 
to a previously solved problem, then the 
solution to the old problem may be applied 
to the new problem (Holyoak & Koh, 1987; 
Ross, 1987, 1989). If a cue is similar enough 
to a stored memory, the memory may be 
retrieved (Raaijmakers & Shiffrin, 1981). 
If an event is similar enough to a previ- 
ously experienced event, the stored event’s 
outcome may be offered as a candidate pre- 
diction for the current event (Sloman, 1993; 
Tenenbaum & Griffiths, 2001). If an un- 
known object is similar enough to a known 
object, then the known object’s category 
label may be applied to the unknown object 
(Nosofsky, 1986). The act of comparing 
events, objects, and scenes and establishing 
similarities between them is of critical 
importance for the cognitive processes we 
depend on. 

The utility of similarity for ground- 
ing our concepts has been rediscovered 
in all the fields comprising cognitive sci- 
ence (see Medin & Rips, Chap. 3). Exem- 
plar (Estes, 1994; Kruschke, 1992; Lamberts, 
2000; Medin & Schaffer, 1978; Nosofsky, 
1986), instance-based (Aha, 1992), view- 
based (Tarr & Gauthier, 1998), case-based 
(Schank, 1982), nearest neighbor (Ripley, 
1996), configural cue (Gluck & Bower, 
1990), and vector quantization (Kohonen, 
1995) models share the underlying strat- 
egy of giving responses learned from similar, 
previously presented patterns to novel pat- 
terns. Thus, a model can respond to rep- 
etitions of these patterns; it can also give 
responses to novel patterns that are likely 
to be correct by sampling responses to 
old patterns weighted by their similarity
to the novel pattern. Consistent with
these models, psychological evidence sug- 
gests that people show good transfer to 
new stimuli in perceptual tasks to the ex- 
tent that the new stimuli resemble previ- 
ously learned stimuli (Kolers & Roediger, 
1984; Palmeri, 1997). Another common
feature of these approaches is that they
processed form. This parallels the constraint
described previously on the applicability 
of similarity. Both raw representations and 
generic similarity assessments are most use- 
ful as a default strategy when one does not 
know exactly what properties of a stimulus 
are important. One’s best bet is to follow the 
principle of least commitment (Marr, 1982) 
and keep mental descriptions in a relatively 
raw form to preserve information that may 
be needed at a later point. 
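
The exemplar-style strategy described above, in which a novel pattern receives the responses of stored patterns weighted by similarity, can be sketched in a few lines. This is only an illustration, not any one of the cited models: the exponential-decay similarity function, the sensitivity parameter `c`, and the stored patterns are all assumptions made for the example.

```python
from math import exp


def similarity(x, y, c=1.0):
    """Assumed similarity: exponential decay with city-block distance."""
    distance = sum(abs(a - b) for a, b in zip(x, y))
    return exp(-c * distance)


def response_probabilities(novel, exemplars, c=1.0):
    """Weight each stored response by the similarity of its pattern to the novel pattern."""
    totals = {}
    for pattern, response in exemplars:
        totals[response] = totals.get(response, 0.0) + similarity(novel, pattern, c)
    norm = sum(totals.values())
    return {response: weight / norm for response, weight in totals.items()}


# Hypothetical stored patterns in their raw (coordinate) form, each with a learned response.
stored = [((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((3.0, 3.0), "B")]
probs = response_probabilities((0.5, 0.5), stored)
print(probs)
```

Because the stored patterns are kept whole rather than summarized, the same memory can serve later judgments that weight the dimensions differently.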

Another reason for studying similarity is 
that it provides an elegant diagnostic tool 
for examining the structure of our mental 
entities and the processes that operate on 
them. For example, one way to tell that a 
physicist has progressed beyond the novice 
stage is that he or she sees deep similari- 
ties between problems that require calcu- 
lation of force even though the problems 
are superficially dissimilar (Chi, Feltovich, 
& Glaser, 1981; see Novick & Bassok, Chap. 
14). Given that psychologists have no mi- 
croscope with direct access to people’s rep- 
resentations of their knowledge, appraisals 
of similarity provide a powerful, if indirect, 
lens onto representation/process assemblies 
(see also Doumas & Hummel, Chap. 4). 

A final reason to study similarity is
that it occupies an important ground between
perceptual constraints and higher-level
knowledge system functions. Similarity
is grounded by perceptual functions. A
tone of 200 Hz and a tone of 202 Hz sound 
similar (Shepard, 1987), and the similar- 
ity is cognitively impenetrable (Pylyshyn, 
1985) — enough that there is little that 
can be done to alter this perceived similarity.
However, similarity is also highly flexible
and dependent on knowledge and purpose.
By focusing on patterns of motion
and relations, even electrons and planets can 
be made to seem similar (Gentner, 1983; 
Holyoak & Thagard, 1989; see Holyoak, 
Chap. 6). A complete account of similar- 
ity will make contact both with Fodor’s 
(1983) isolated and modularized percep- 
tual input devices and the “central system” 
in which everything a person knows may 
be relevant. 


to Similarity 


There have been a number of formal treat- 
ments that simultaneously provide theoreti- 
cal accounts of similarity and describe how it 
can be empirically measured (Hahn, 2003). 
These models have had a profound practical 
impact in statistics, automatic pattern recog- 
nition by machines, data mining, and mar- 
keting (e.g., online stores can provide “peo- 
ple similar to you liked the following other 
items...”). Our brief survey is organized in 
terms of the following models: geometric, 
feature based, alignment based, and trans- 
formational. 


Geometric Models and 
Multidimensional Scaling 


Geometric models of similarity have been 
among the most influential approaches to 
analyzing similarity (Carroll & Wish, 1974; 
Torgerson, 1958, 1965). These approaches 
are exemplified by nonmetric multidimen- 
sional scaling (MDS) models (Shepard, 
1962a, 1962b). MDS models represent sim- 
ilarity relations between entities in terms 
of a geometric model that consists of a set 
of points embedded in a dimensionally or- 
ganized metric space. The input to MDS 
routines may be similarity judgments, dis- 
similarity judgments, confusion matrices, 
correlation coefficients, joint probabilities, 
or any other measure of pairwise proximity. 
The output of an MDS routine is a geomet- 
ric model of the data, with each object of 
the data set represented as a point in an n- 
dimensional space. The similarity between a 
pair of objects is taken to be inversely related 
to the distance between two objects’ points 
in the space. In MDS, the distance between 
points i and j is typically computed by 


\[
\text{dissimilarity}(i, j) = \left[ \sum_{k=1}^{n} \left| X_{ik} - X_{jk} \right|^{r} \right]^{1/r} \tag{2.1}
\]


where n is the number of dimensions, X_{ik}
is the value of dimension k for item i, and r
is a parameter that allows different spatial
metrics to be used. If r = 2, then a standard
Euclidean notion of distance is invoked,
whereby the distance between two points
is the length of the straight line connecting
the points. If r = 1, then distance involves
a city-block metric where the distance
between two points is the sum of
their distances on each dimension (“short- 
cut” diagonal paths are not allowed to di- 
rectly connect points differing on more than 
one dimension). A Euclidean metric often
provides a better fit to empirical data
when the stimuli being compared are composed
of integral, perceptually fused dimensions
such as the brightness and saturation
of a color. Conversely, a city-block metric is
often appropriate for psychologically separable
dimensions such as brightness and size
(Attneave, 1950). 
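
As a minimal sketch of Eq. 2.1, the function below computes the dissimilarity between two items from their coordinates for any value of r. The function name and the two example items are hypothetical, chosen only to contrast the Euclidean (r = 2) and city-block (r = 1) cases.

```python
def minkowski_dissimilarity(x_i, x_j, r):
    """Eq. 2.1: [sum over k of |x_ik - x_jk|^r]^(1/r) for two coordinate tuples."""
    if len(x_i) != len(x_j):
        raise ValueError("items must have the same number of dimensions")
    return sum(abs(a - b) ** r for a, b in zip(x_i, x_j)) ** (1.0 / r)


# Two hypothetical items that differ by 3 units on one dimension and 4 on the other.
item_i = (0.0, 0.0)
item_j = (3.0, 4.0)

euclidean = minkowski_dissimilarity(item_i, item_j, r=2)   # straight-line distance
city_block = minkowski_dissimilarity(item_i, item_j, r=1)  # sum of per-dimension distances
print(euclidean, city_block)
```

The city-block value is the larger of the two here because it forbids the diagonal shortcut.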

Richardson’s (1938) fundamental insight, 
which is the basis of contemporary use of 
MDS, was to begin with subjects’ judgments 
of pairwise object dissimilarity and work 
backward to determine the dimensions and 
dimension values that subjects used in mak- 
ing their judgments. MDS algorithms pro- 
ceed by placing entities in an n-dimensional 
space such that the distances between the 
entities accurately reflect the empirically ob- 
served similarities. For example, if we asked 
people to rate the similarities [on a scale 
from 1 (low similarity) to 10 (high similar- 
ity)] of Russia, Cuba, and Jamaica, we might 


find 


Similarity (Russia, Cuba) = 7 
Similarity (Russia, Jamaica) = 1 
Similarity (Cuba, Jamaica) = 8 


An MDS algorithm would try to position the 
three countries in a space such that coun- 
tries that are rated as being highly similar 
are very close to each other in the space. 
With nonmetric scaling techniques, only ordinal
similarity relations are preserved. The
interpoint distances suggested by the similarity
ratings may not be simultaneously satisfiable
in a given dimensional space. If we
limit ourselves to a single dimension (we
place the countries on a “number line”), then
we cannot simultaneously place Russia near
Cuba (similarity = 7) and place Russia far
away from Jamaica (similarity = 1). In MDS 
terms, the “stress” of the one-dimensional so- 
lution would be high. We could increase the 
dimensionality of our solution and position 
the points in two-dimensional space. A perfect
reconstruction of any set of proximities
among a set of n objects can be obtained if
a high enough dimensionality (specifically,
n − 1 dimensions) is used.
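
To make the idea of "stress" concrete, the sketch below scores one candidate number-line layout of the three countries against the ratings given above. Several things here are assumptions rather than part of the original analysis: ratings are converted to target dissimilarities with 10 minus the rating, the layout coordinates are hypothetical, and the stress formula is a simple metric (Kruskal-style) version that compares raw distances, whereas nonmetric MDS would first apply a monotone transformation.

```python
from itertools import combinations
from math import sqrt

# Similarity ratings from the example (1 = low, 10 = high).
similarity = {
    ("Russia", "Cuba"): 7,
    ("Russia", "Jamaica"): 1,
    ("Cuba", "Jamaica"): 8,
}

# Assumption: turn ratings into target dissimilarities with 10 - rating.
target = {pair: 10 - rating for pair, rating in similarity.items()}


def stress(positions, target):
    """Normalized misfit between layout distances and target dissimilarities."""
    squared_error = 0.0
    squared_distance = 0.0
    for a, b in combinations(positions, 2):
        d = sqrt(sum((p - q) ** 2 for p, q in zip(positions[a], positions[b])))
        delta = target[(a, b)] if (a, b) in target else target[(b, a)]
        squared_error += (d - delta) ** 2
        squared_distance += d ** 2
    return sqrt(squared_error / squared_distance)


# A hypothetical one-dimensional layout (a "number line").
line = {"Russia": (0.0,), "Cuba": (3.0,), "Jamaica": (5.0,)}
line_stress = stress(line, target)
print(round(line_stress, 3))
```

An MDS routine would search over layouts to minimize this kind of misfit, adding dimensions when no low-dimensional layout brings it down far enough.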

One of the main applications of MDS is to 
determine the underlying dimensions com- 
prising the set of compared objects. Once 
the points are positioned in a way that faith- 
fully mirrors the subjectively obtained simi- 
larities, it is often possible to give interpreta- 
tions to the axes or to rotations of the axes. 
In the previous example, dimensions may 
correspond to “political affiliation” and “cli- 
mate.” Russia and Cuba would have similar 
values on the former dimension; Jamaica and 
Cuba would have similar values on the lat- 
ter dimension. A study by Smith, Shoben, 
and Rips (1974) illustrates a classic use of 
MDS (Figure 2.1). They obtained similar- 
ity ratings from subjects on many pairs of 
birds. Submitting these pairwise similarity 
ratings to MDS analysis, they hypothesized 
underlying features that were used for rep- 
resenting the birds. Assigning subjective in- 
terpretations to the geometric model’s axes, 
the experimenters suggested that birds were 
represented in terms of their values on di- 
mensions such as “ferocity” and “size.” It is 
important to note that the proper psycholog- 
ical interpretation of a geometric represen- 
tation of objects is not necessarily in terms 
of its Cartesian axes. In some domains, such 
as musical pitches, the best interpretation 
of objects may be in terms of their polar 
coordinates of angle and length (Shepard, 
1982). More recent work has extended geometric
representations still further, representing
patterns of similarities by generalized,
nonlinear manifolds (Tenenbaum, De
Silva, & Langford, 2000).

MDS is also used to create a compressed
representation that conveys relative similarities
among a set of items. A set of n items
requires n(n − 1)/2 numbers to express

Figure 2.1. Two multidimensional scaling (MDS) solutions for sets of birds (A) and animals (B). The
distances between words in the MDS space reflect their psychological dissimilarity. Once an MDS
solution has been made, psychological interpretations for the dimensions may be possible. In these
solutions, the horizontal and vertical dimensions may represent size and domesticity, respectively.
(Reprinted from Rips, Shoben, & Smith, 1974, by permission.)


all pairwise distances among the items, if
it is assumed that any object has a distance
of 0 to itself and distances are symmetric.
However, if an MDS solution fits
the distance data well, it can allow these
same distances to be reconstructed using
only nD numbers, where D is the number
of dimensions of the MDS solution.
This compression may be psychologically 
very useful. One of the main goals of psy- 
chological representation is to create effi- 
cient codes for representing a set of objects. 
Compressed representations can facilitate 
encoding, memory, and processing. Shimon 
Edelman (1999) proposed that both peo- 
ple and machines efficiently code their 
world by creating geometric spaces for ob- 
jects with much lower dimensionality than 
the objects’ physical descriptions (see also
Gärdenfors, 2000).
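
The size of this compression is easy to work out. The short sketch below compares the count of pairwise distances, n(n − 1)/2, with the count of coordinates in a D-dimensional solution; the function names and the choice of D = 2 are illustrative assumptions.

```python
def pairwise_numbers(n):
    """Distinct pairwise distances among n items (zero self-distance, symmetric)."""
    return n * (n - 1) // 2


def coordinate_numbers(n, d):
    """Coordinates needed to place n items in a d-dimensional MDS solution."""
    return n * d


for n in (10, 100, 1000):
    print(n, pairwise_numbers(n), coordinate_numbers(n, 2))
```

The pairwise count grows with the square of n while the coordinate count grows only linearly, so the saving becomes larger as the set of items grows.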

A third use of MDS is to create quan- 
titative representations that can be used 
in mathematical and computational models 
of cognitive processes. Numeric representa- 
tions, namely coordinates in a psychologi- 
cal space, can be derived for stories, pic- 
tures, sounds, words, or any other stimuli 


for which one can obtain subjective sim- 
ilarity data. Once constructed, these nu- 
meric representations can be used to pre- 
dict people’s categorization accuracy, mem- 
ory performance, or learning speed. MDS 
models have been successful in express- 
ing cognitive structures in stimulus do- 
mains as far removed as animals (Smith, 
Shoben, & Rips, 1974), Rorschach ink blots 
(Osterholm, Woods, & Le Unes, 1985), 
chess positions (Horgan, Millis, & Neimeyer, 
1989), and air flight scenarios (Schvaneveldt, 
1985). Many objects, situations, and con- 
cepts seem to be psychologically structured 
in terms of dimensions, and a geomet- 
ric interpretation of the dimensional orga- 
nization captures a substantial amount of 
that structure. 


Featural Models 


In 1977, Amos Tversky brought into promi- 
nence what would become the main con- 
tender to geometric models of similarity in 
psychology. The reason given for proposing
a feature-based model was that subjective
assessments of similarity did not always
satisfy the assumptions of geometric models
of similarity.


PROBLEMS WITH THE STANDARD 
GEOMETRIC MODEL 

Three assumptions of standard geometric 
models of similarity are 


Minimality: D(A,B) ≥ D(A,A) = 0

Symmetry: D(A,B) = D(B,A)

Triangle Inequality: D(A,B) + D(B,C) ≥ D(A,C)


where D(A,B) is interpreted as the dissim- 
ilarity between items A and B. Accord- 
ing to the minimality assumption, all ob- 
jects are equally (dis)similar to themselves. 
Some violations of this assumption are found 
(Nickerson, 1972) when confusion rates or 
RT measures of similarity are used. First, 
not all letters are equally similar to them- 
selves. For example, in Podgorny and Gar- 
ner (1979), if the letter S is shown twice 
on a screen, subjects are faster to correctly 
say that the two tokens are similar (i.e., they
come from the same similarity-defined cluster)
than if the twice-shown letter is W. By
the reaction time measure of similarity, the
letter S is more similar to itself than the letter
W is to itself. Even more troublesome
for the minimality assumption, two different
letters may be more similar to each other
than a particular letter is to itself. The letter
C is more similar to the letter O than W is to 
itself, as measured by interletter confusions. 
In Gilmore, Hersh, Caramazza, and Griffin 
(1979), the letter M is more often recognized 
as an H (p = .391) than as an M (p = .180). 
This is problematic for geometric represen- 
tations because the distance between a point 
and itself should be zero. 
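
The three assumptions listed at the start of this section can be checked mechanically against any table of dissimilarities. The sketch below is one way to do so; the function name, the tolerance, and the dissimilarity values are all hypothetical and are not taken from any of the studies cited.

```python
from itertools import combinations, permutations


def metric_violations(D, items, tol=1e-9):
    """List the ways a dissimilarity table D breaks the three assumptions.

    D maps ordered pairs (x, y) to a dissimilarity; items lists the labels.
    """
    violations = []
    # Minimality: D(A,B) >= D(A,A) = 0.
    for x in items:
        if abs(D[(x, x)]) > tol:
            violations.append(("minimality", (x, x)))
    for x, y in permutations(items, 2):
        if D[(x, y)] < D[(x, x)] - tol:
            violations.append(("minimality", (x, y)))
    # Symmetry: D(A,B) = D(B,A).
    for x, y in combinations(items, 2):
        if abs(D[(x, y)] - D[(y, x)]) > tol:
            violations.append(("symmetry", (x, y)))
    # Triangle inequality: D(A,B) + D(B,C) >= D(A,C).
    for x, y, z in permutations(items, 3):
        if D[(x, y)] + D[(y, z)] < D[(x, z)] - tol:
            violations.append(("triangle inequality", (x, y, z)))
    return violations


# Entirely hypothetical dissimilarities among three items.
items = ["A", "B", "C"]
D = {
    ("A", "A"): 0.0, ("B", "B"): 0.0, ("C", "C"): 0.0,
    ("A", "B"): 1.0, ("B", "A"): 2.0,   # asymmetric pair
    ("B", "C"): 1.0, ("C", "B"): 1.0,
    ("A", "C"): 5.0, ("C", "A"): 5.0,   # longer than the path through B
}
found = metric_violations(D, items)
for assumption, where in found:
    print(assumption, where)
```

Applied to confusion or reaction-time data, a check of this kind would flag the self-dissimilarity results described above as minimality violations.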

According to the symmetry assumption, 
(dis)similarity should not be affected by the 
ordering of items because the distance from 
point A to B is equal to the distance from 
B to A. Contrary to this presumed sym- 
metry, similarity is asymmetric on occasion 
(Tversky, 1977). In one of Tversky’s exam- 
ples, North Korea is judged to be more simi- 
lar to Red China than Red China is to North 
Korea. Often, a nonprominent item is more
similar to a prominent item than vice versa.


This is consistent with the result that peo- 
ple judge their friends to be more similar 
to themselves than they themselves are to 
their friends (Holyoak & Gordon, 1983), un- 
der the assumption that a person is highly 
prominent to him- or herself. More recently, 
Polk et al. (2002) found that when the fre- 
quency of colors is experimentally manipu- 
lated, rare colors are judged to be more sim- 
ilar to common colors than common colors 
are to rare colors. 

According to the triangle inequality assumption
(Figure 2.2), the distance/dissimilarity
between two points A and B cannot
be more than the distance between A and 
a third point C plus the distance between C 
and B. Geometrically speaking, a straight line 
connecting two points is the shortest path 
between the points. Tversky and Gati (1982) 
found violations of this assumption when it 
is combined with an assumption of segmen- 
tal additivity [D(A,B) + D(B,C) = D(A,C), 
if A, B, and C lie on a straight line]. Con- 
sider three items in multidimensional space, 
A, B, and C, falling on a straight line such 
that B is between A and C. Also consider 
a fourth point, E, that forms a right trian- 
gle when combined with A and C. The tri- 
angle inequality assumption cum segmental 
additivity predicts that 


D(A,E) > D(A,B) and D(E,C) > D(B,C) 
D(A,E) > D(B,C) and D(E,C) > D(A,B) 


Systematic violations of this prediction are 
found such that the path going through the 
corner point E is shorter than the path going 
through the center point B. For example, if 
the items are instantiated as 


A = White, 3 inches 
B = Pink, 4 inches 
C = Red, 5 inches 
E = Red, 3 inches 


then people’s dissimilarity ratings indicate
that D(A,E) < D(A,B) and D(E,C) <
D(B,C). Such an effect can be modeled by

Figure 2.2. The triangle inequality assumption
requires the path from A to C going through B to
be shorter than the path going through E.


geometric models of similarity if r in Eq. 2.1
is given a value less than 1. However, if r is
less than 1, then dissimilarity does not satisfy
a power metric, which is often considered a
minimal assumption for geometric solutions
to be interpretable. The two assumptions
of a power metric are (1) distances along
straight lines are additive, and (2) the shortest
path between points is a straight line.
Other potential problems with geometric 
models of similarity are (1) they strictly limit 
the number of nearest neighbors an item 
can have (Tversky & Hutchinson, 1986), (2) 
MDS techniques have difficulty describing 
items that vary on a large number of features 
(Krumhansl, 1978), and (3) standard MDS 
techniques do not predict that adding com- 
mon features to items increases their sim- 
ilarity (Tversky & Gati, 1982). On the first 
point, MDS models consisting of two dimen- 
sions cannot predict that item X is the clos- 
est item to 100 other items. There would be 
no way of placing those 100 items in two 
dimensions such that X would be closer to 
all of them than any other item. For human 
data, a superordinate term (e.g., fruit) is of- 
ten the nearest neighbor of many of its ex- 
emplars (apples, bananas, etc.), as measured 
by similarity ratings. On the second point, 
although there is no logical reason why geometric
models cannot represent items of
any number of dimensions (as long as the
number of dimensions is less than the number of
items minus one), geometric models tend to


solutions in low-dimensional space. MDS so- 
lutions involving more than six dimensions 
are rare. On the third point, the addition of 
the same feature to a pair of items increases 
their rated similarity (Gati & Tversky, 
1984), but this is incompatible with sim- 
ple MDS models. If adding a shared feature 
corresponds to adding a dimension in which 
the two items under consideration have the 
same value, then there will be no change to 
the items’ dissimilarity because the geomet- 
ric distance between the points remains the 
same. MDS models that incorporate the di- 
mensionality of the space could predict the 
influence of shared features on similarity, but 
such a model would no longer relate similar- 
ity directly to an inverse function of inter- 
item distance. 
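
Two of the points made above can be checked numerically with Eq. 2.1: that r less than 1 lets the corner path through E come out shorter than the direct path, and that adding a dimension on which two items share a value leaves their distance unchanged. The coordinates below are hypothetical codings (color and size as equally spaced values), and r = 0.5 is an arbitrary choice below 1.

```python
def minkowski(x, y, r):
    """Eq. 2.1 applied to two coordinate tuples."""
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


# Hypothetical (color, size) codings: B lies midway between A and C; E is the corner point.
A, B, C, E = (0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)

r = 0.5
via_corner = minkowski(A, E, r) + minkowski(E, C, r)
direct = minkowski(A, C, r)
print(via_corner < direct)                      # corner path is shorter than the direct path
print(minkowski(A, E, r) < minkowski(A, B, r))  # matches the pattern D(A,E) < D(A,B)

# Adding a shared feature as an extra dimension with equal values for both items.
X, Y = (1.0, 4.0), (3.0, 0.0)
X_shared, Y_shared = X + (7.0,), Y + (7.0,)
print(abs(minkowski(X, Y, 2) - minkowski(X_shared, Y_shared, 2)) < 1e-12)
```

The first comparison shows why r below 1 can mimic the human data only by giving up the triangle inequality; the last shows why a simple distance model predicts no gain in similarity from a shared feature.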

One research strategy has been to aug- 
ment geometric models of similarity in ways 
that solve these problems. One solution, sug- 
gested by Carol Krumhansl (1978), has been 
to model dissimilarity in terms of both inter- 
item distance in a multidimensional space 
and spatial density in the neighborhoods of 
the compared items. The more items there 
are in the vicinity of an item, the greater 
the spatial density of the item. Items are 
more dissimilar if they have many items sur- 
rounding them (their spatial density is high) 
than if they have few neighboring items. By 
including spatial density in an MDS analy- 
sis, violations of minimality, symmetry, and 
the triangle inequality can potentially be ac- 
counted for, as well as some of the influence 
of context on similarity. However, the em- 
pirical validity of the spatial density hypoth- 
esis is in some doubt (Corter, 1987, 1988; 
Krumhansl, 1988; Tversky & Gati, 1982). 

Robert Nosofsky (1991) suggested another
potential way to save MDS models
from some of the previous criticisms. He introduced
individual bias parameters in addition
to the inter-item relation term. Similarity
is modeled in terms of inter-item
distance and biases toward particular items. 
Biases toward items may be due to 
attention, salience, knowledge, and fre- 
quency of items. This revision handles 
asymmetric similarity results and the result


THE CONTRAST MODEL


In light of the previous potential prob- 
lems for geometric representations, Tversky 
(1977) proposed to characterize similarity in 
terms of a feature-matching process based 
on weighting common and distinctive fea- 
tures. In this model, entities are represented 
as a collection of features and similarity is 
computed by 


\[
S(A, B) = \theta f(A \cap B) - a f(A - B) - b f(B - A) \tag{2.2}
\]


The similarity of A to B is expressed as a
linear combination of the measure of the
common and distinctive features. The term
(A ∩ B) represents the features that items A
and B have in common. (A − B) represents
the features that A has but B does not. (B −
A) represents the features of B that are not in
A. θ, a, and b are weights for the common
and distinctive components. Common features,
as compared with distinctive features,
are given relatively more weight for verbal as 
opposed to pictorial stimuli (Gati & Tversky, 
1984), cohesive as opposed to noncohesive 
stimuli (Ritov, Gati, & Tversky, 1990), sim- 
ilarity as opposed to difference judgments 
(Tversky, 1977), and entities with a large 
number of distinctive as opposed to com- 
mon features (Gati & Tversky, 1984). There 
are no restrictions on what may constitute a 
feature. A feature may be any property, characteristic,
or aspect of a stimulus. Features
may be concrete or abstract (e.g., “symmetric”
or “beautiful”).
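
A minimal sketch of Eq. 2.2 treats each entity as a set of features. Two assumptions are made for illustration: the measure f simply counts features (the model itself allows any salience measure), and the feature sets and weights are hypothetical. With a not equal to b, the same pair of items yields different similarities depending on the direction of comparison.

```python
def contrast_similarity(a_features, b_features, theta=1.0, a=1.0, b=1.0, f=len):
    """Eq. 2.2: theta*f(A and B) - a*f(A minus B) - b*f(B minus A)."""
    common = a_features & b_features
    a_only = a_features - b_features
    b_only = b_features - a_features
    return theta * f(common) - a * f(a_only) - b * f(b_only)


# Hypothetical feature sets: the second item has more distinctive features.
sparse_item = {"f1", "f2", "f3"}
rich_item = {"f1", "f2", "f4", "f5", "f6"}

sparse_to_rich = contrast_similarity(sparse_item, rich_item, theta=1.0, a=0.8, b=0.2)
rich_to_sparse = contrast_similarity(rich_item, sparse_item, theta=1.0, a=0.8, b=0.2)
print(sparse_to_rich, rich_to_sparse)
```

Passing a different `f` (for example, a function that sums per-feature salience weights) would change the measure without changing the structure of the model.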

The contrast model predicts asymmetric
similarity because a is not constrained to
equal b and f(A − B) may not equal f(B −
A). North Korea is predicted to be more
similar to Red China than vice versa if Red
China has more salient distinctive features
than North Korea, and a is greater than b.
The contrast model can also account for
nonmirroring between similarity and difference
judgments. The common features term
(A ∩ B) is hypothesized to receive more
weight in similarity than difference judgments;
the distinctive features term receives
relatively more weight in difference judg- 
ments. As a result, certain pairs of stimuli 
may be perceived as simultaneously being 
more similar to and more different from each 
other compared with other pairs (Tversky, 
1977). Sixty-seven percent of a group of sub- 
jects selected West Germany and East Ger- 
many as more similar to each other than 
Ceylon and Nepal. Seventy percent of sub- 
jects also selected West Germany and East 
Germany as more different from each other 
than Ceylon and Nepal. According to Tver- 
sky, East and West Germany have more 
common and more distinctive features than 
Ceylon and Nepal. Medin, Goldstone, and 
Gentner (1993) presented additional evi- 
dence for nonmirroring between similarity 
and difference, exemplified in Figure 2.3. 
When two scenes share a relatively large 
number of relational commonalities (e.g., 
scenes T and B both have three objects that 
have the same pattern), but also a large num- 
ber of differences on specific attributes (e.g., 
none of the patterns in scene T match any 
of the patterns in B), then the scenes tend 
to be judged as simultaneously very similar 
and very different. 

A number of models are similar to
the contrast model in basing similarity
on features and in using some combination
of the (A ∩ B), (A − B), and (B − A)
components. Sjöberg (1972) proposed that
similarity is defined as f(A ∩ B)/f(A ∪ B).
Eisler and Ekman (1959) claimed that
similarity is proportional to f(A ∩ B)/
(f(A) + f(B)). Bush and Mosteller (1951)
defined similarity as f(A ∩ B)/f(A). These
three models can all be considered specializations
of the general equation f(A ∩ B)/
[f(A ∩ B) + af(A − B) + bf(B − A)]. As
such, they differ from the contrast model
by applying a ratio function as opposed to
a linear contrast of common and distinctive
features.
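
The way the three ratio models fall out of one general equation can be verified with a short sketch. It uses the form of the general equation whose denominator begins with f(A ∩ B), since that is the form that reduces to the three models, and it assumes f counts features; the feature sets are hypothetical.

```python
def ratio_similarity(A, B, a, b, f=len):
    """General ratio form: f(A and B) / [f(A and B) + a*f(A minus B) + b*f(B minus A)]."""
    common = f(A & B)
    return common / (common + a * f(A - B) + b * f(B - A))


# Hypothetical feature sets sharing two features.
A = {"f1", "f2", "f3"}
B = {"f1", "f2", "f4", "f5", "f6"}

sjoberg = ratio_similarity(A, B, a=1.0, b=1.0)         # equals f(A and B) / f(A or B)
eisler_ekman = ratio_similarity(A, B, a=0.5, b=0.5)    # equals 2 f(A and B) / (f(A) + f(B))
bush_mosteller = ratio_similarity(A, B, a=1.0, b=0.0)  # equals f(A and B) / f(A)
print(sjoberg, eisler_ekman, bush_mosteller)
```

The function divides by zero when both sets are empty, so a fuller implementation would need to decide how to treat featureless items.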

The fundamental premise of the contrast
model, that entities can be described
in terms of constituent features, is a powerful
idea in cognitive psychology. Featural

Figure 2.3. The set of objects in B is selected as both more similar to, and more
different from, the set of objects in T relative to the set of objects in A. From
Medin, Goldstone, and Gentner (1990). Reprinted by permission.


analyses have proliferated in domains of 
speech perception (Jakobson, Fant, & Halle, 
1963), pattern recognition (Neisser, 1967; 
Treisman, 1986), perception physiology 
(Hubel & Wiesel, 1968), semantic content 
(Katz & Fodor, 1963), and categorization 
(Medin & Schaffer, 1978; see Medin & Rips, 
Chap. 3). Neural network representations 
are often based on features, with entities be- 
ing broken down into a vector of ones and 
zeros, where each bit refers to a feature or 
“microfeature.” Similarity plays a crucial role 
in many connectionist theories of generaliza- 
tion, concept formation, and learning. The 
notion of dissimilarity used in these systems 
is typically the fairly simple function “Hamming
distance.” The Hamming distance between
two strings is simply their city-block
distance; that is, it is their (A − B) +
(B − A) term. “10011” and “11111”
would have a Hamming distance of 2 because
they differ on two bits. Occasionally,
more sophisticated measures of similarity
in neural networks normalize dissimilarities
by string length. Normalized Hamming distance
functions can be expressed by
[(A − B) + (B − A)]/[f(A ∩ B)].
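
As a minimal sketch, Hamming distance over bit strings can be computed by counting mismatching positions. The normalized variant below divides by string length, following the verbal description above; that choice of normalizer is an assumption of this sketch.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def normalized_hamming(a, b):
    """Hamming distance divided by string length (assumed normalizer)."""
    if not a:
        raise ValueError("strings must be non-empty")
    return hamming(a, b) / len(a)


print(hamming("10011", "11111"))             # 2
print(normalized_hamming("10011", "11111"))  # 0.4
```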


SIMILARITIES BETWEEN GEOMETRIC 
AND FEATURE-BASED MODELS 


Although MDS and featural models are of- 
ten analyzed in terms of their differences, 
they also share a number of similarities. 
More recent progress has been made on 
combining both representations into a sin- 
gle model, using Bayesian statistics to deter- 
mine whether a given source of variation is 
more efficiently represented as a feature or 
dimension (Navarro & Lee, 2003). Tversky 
and Gati (1982) described methods of trans- 
lating continuous dimensions into featural 
representations. Dimensions that are sensi- 
bly described as being more or less (e.g., loud 
is more sound than soft, bright is more light 
than dim, and large is more size than small) 
can be represented by sequences of nested
a subset of A’s features whenever B is louder,
brighter, or larger than A. Alternatively, for
qualitative attributes such as shape or hue
(red is not subjectively “more” than blue),
dimensions can be represented by chains of
features such that if B is between A and
C on the dimension, then (A ∩ B) ⊃ (A ∩ C)
and (B ∩ C) ⊃ (A ∩ C). For example, if orange
lies between red and yellow on the hue
dimension, then this can be featurally represented
if orange and red share features that
orange and yellow do not share.
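
The two encodings can be sketched with ordinary sets. The feature labels (s1, h1, and so on) are hypothetical placeholders, and the direction of nesting (a louder sound carries all the features of a softer one) is an assumption of this sketch.

```python
# Nested feature sets for a "more or less" dimension such as loudness.
loudness = {
    "soft": {"s1"},
    "medium": {"s1", "s2"},
    "loud": {"s1", "s2", "s3"},
}
# Each level's features form a proper subset of the next level's features.
assert loudness["soft"] < loudness["medium"] < loudness["loud"]

# Chained feature sets for a qualitative dimension such as hue:
# neighbors on the dimension overlap, but no hue contains another.
hue = {
    "red": {"h1", "h2"},
    "orange": {"h2", "h3"},
    "yellow": {"h3", "h4"},
}
A, B, C = hue["red"], hue["orange"], hue["yellow"]

# B (orange) lies between A (red) and C (yellow), so
# (A ∩ B) ⊃ (A ∩ C) and (B ∩ C) ⊃ (A ∩ C).
assert (A & B) > (A & C)
assert (B & C) > (A & C)
print(sorted(A & B), sorted(B & C), sorted(A & C))
```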

An important attribute of MDS mod- 
els is that they create postulated representa- 
tions, namely dimensions, that explain the 
systematicities present in a set of similar- 
ity data. This is a classic use of abductive 
reasoning; dimensional representations are 
hypothesized that, if they were to exist, 
would give rise to the obtained similarity 
data. Other computational techniques share 
with MDS the goal of discovering the un- 
derlying descriptions for items of interest 
but create featural rather than dimensional 
representations. Hierarchical cluster analysis,
like MDS, takes pairwise proximity
data as input. Rather than output a geometric
space with objects as points, hierarchical
cluster analysis outputs an inverted-
tree diagram with items at the root-level 
connected with branches. The smaller the 
branching distance between two items, the 
more similar they are. Just as the dimen- 
sional axes of MDS solutions are given sub- 
jective interpretations, the branches are also 
given interpretations. For example, in Shep- 
ard’s (1972) analysis of speech sounds, one 
branch is interpreted as voiced phonemes, 
whereas another branch contains the un- 
voiced phonemes. In additive cluster analysis 
(Shepard & Arabie, 1979), similarity data are 
transformed into a set of overlapping item 
clusters. Items that are highly similar will 
tend to belong to the same clusters. Each 
cluster can be considered as a feature. More 
recent progress has been made on efficient 
and mathematically principled models that 
find such featural representations for large 
databases (Lee, 2002a, 2002b; Tenenbaum,
1996).
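
To illustrate how pairwise proximities become a tree, here is a minimal sketch of one common variant of hierarchical clustering, single linkage, in which the distance between two clusters is the smallest distance between any of their members. The dissimilarity values are hypothetical and are not Shepard's data; the function name is likewise an illustration.

```python
def single_linkage(items, dist):
    """Agglomerative single-linkage clustering.

    `dist` maps frozenset({x, y}) to a dissimilarity. Returns the list of
    merges as (cluster_a, cluster_b, branching_distance) in merge order.
    """
    clusters = [frozenset([item]) for item in items]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist[frozenset((x, y))]
                        for x in clusters[i] for y in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((set(clusters[i]), set(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical dissimilarities among four consonants.
d = {
    frozenset(("b", "d")): 1.0,
    frozenset(("p", "t")): 1.5,
    frozenset(("b", "p")): 4.0,
    frozenset(("b", "t")): 5.0,
    frozenset(("d", "p")): 5.0,
    frozenset(("d", "t")): 4.5,
}

for left, right, height in single_linkage(["b", "d", "p", "t"], d):
    print(sorted(left), sorted(right), height)
```

With these made-up values, the two voiced consonants join first, the two unvoiced consonants join next, and the two resulting branches merge last, at the largest branching distance.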


ric and featural representations, one that 
motivates the next major class of similar- 
ity models that we consider, is that both 
use relatively unstructured representations. 
Entities are structured as sets of features 
or dimensions with no relations between 
these attributes. Entities such as stories, sen- 
tences, natural objects, words, scientific the- 
ories, landscapes, and faces are not sim- 
ply a “grab bag” of attributes. Two kinds 
of structure seem particularly important: 
propositional and hierarchical. A proposi- 
tion is an assertion about the relation be- 
tween informational entities (Palmer, 1975). 
For example, relations in a visual domain 
might include above, near, right, inside, and 
larger than, which take informational enti- 
ties as arguments. The informational enti- 
ties might include features such as square 
and values on dimensions such as 3 inches. 
Propositions are defined as the smallest unit 
of knowledge that can stand as a separate 
assertion and have a truth value. The or- 
der of the arguments in the predicate is 
critical. For example, above (triangle, circle) 
does not represent the same fact as above 
(circle, triangle). Hierarchical representations 
involve entities that are embedded in one 
another. Hierarchical representations are re- 
quired to represent the fact that X is part of 
Y or that X is a kind of Y. For example, in 
Collins and Quillian’s (1969) propositional 
networks, labeled links (“Is-a” links) stand for 
the hierarchical relation between canary and 
bird. 
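
Both kinds of structure can be sketched with simple data types. The `Proposition` type and the `is_kind_of` helper are hypothetical names introduced for this illustration, and the bird-to-animal link is added only to show that Is-a links chain.

```python
from typing import NamedTuple


class Proposition(NamedTuple):
    """A relation applied to an ordered tuple of arguments."""
    relation: str
    args: tuple


above_tc = Proposition("above", ("triangle", "circle"))
above_ct = Proposition("above", ("circle", "triangle"))

# Argument order matters: these are two different assertions.
print(above_tc == above_ct)  # False

# Hierarchical structure as labeled "Is-a" links (child -> parent).
is_a = {"canary": "bird", "bird": "animal"}


def is_kind_of(x, y):
    """Follow Is-a links upward from x and report whether y is reached."""
    while x in is_a:
        x = is_a[x]
        if x == y:
            return True
    return False


print(is_kind_of("canary", "bird"))  # True
print(is_kind_of("bird", "canary"))  # False
```

An unordered feature set cannot express either distinction, which is the limitation the text attributes to geometric and featural representations.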

Some quick fixes to geometric and feat- 
ural accounts of similarity are possible, but 
they fall short of a truly general capacity to 
handle structured inputs. Hierarchical clus- 
tering does create trees of features, but there 
is no guarantee that there are relationships, 
such as Is-a or Part-of, between the subtrees. 
However, structure might exist in terms of 
features that represent conjunctions of prop- 
erties. For example, using the materials in 
Figure 2.4, 20 undergraduates were shown 
triads consisting of A, B, and T and were 
asked to say whether scene A or B was more 
similar to T. The strong tendency to choose 
A over B in the first panel suggests that
the feature “square” influences similarity.
Other choices indicated that subjects also
based similarity judgments on the spatial
locations and shadings of objects as well as
their shapes.

Figure 2.4. The sets of objects T are typically judged to be more similar to the
objects in the A sets than the B sets. These judgments show that people pay
attention to more than just simple properties such as “black” or “square” when
comparing scenes.

However, it is not sufficient to represent 
the leftmost object of T as {left, square,
black} and base similarity on the number of 
shared and distinctive features. In the second 
panel, A is again judged to be more simi- 
lar to T than is B. Both A and B have the 
features “black” and “square.” The only dif- 
ference is that for A and T, but not B, the 
“black” and “square” features belong to the 
same object. This is only compatible with 
feature set representations if we include the 
possibility of conjunctive features in addition 
to simple features such as “black” and “square” 
(Gluck, 1991; Hayes-Roth & Hayes-Roth, 
1977). By including the conjunctive feature 
“black-square,” possessed by both T and A, 
we can explain, using feature sets, why T is 
more similar to A than B. The third panel 
demonstrates the need for a “black-left” fea- 
ture, and other data indicate a need for a 
“square-left” feature. Altogether, if we want 
to explain the similarity judgments that people
make, we need a feature set representation
that includes six features (three simple
and three complex) to represent the square
of T.
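
The role of conjunctive features can be illustrated with a short sketch. The scenes below are hypothetical stand-ins, not the actual stimuli of Figure 2.4: A and B contain the same simple features, but only A binds “black” and “square” to the same object, as T does.

```python
from itertools import combinations


def feature_set(scene, conjunctive=False):
    """Collect the simple features of every object in a scene.

    With conjunctive=True, also add one pairwise conjunctive feature
    (e.g., "black-square") for each pair of features on the same object.
    """
    feats = set()
    for obj in scene:
        feats |= set(obj)
        if conjunctive:
            feats |= {"-".join(pair) for pair in combinations(sorted(obj), 2)}
    return feats


def shared(x, y, conjunctive=False):
    """Number of features two scenes have in common."""
    return len(feature_set(x, conjunctive) & feature_set(y, conjunctive))


# Hypothetical two-object scenes; each object is a set of simple features.
T = [{"black", "square"}, {"white", "triangle"}]
A = [{"black", "square"}, {"white", "circle"}]
B = [{"black", "circle"}, {"white", "square"}]

print(shared(T, A), shared(T, B))              # 3 3
print(shared(T, A, True), shared(T, B, True))  # 4 3
```

With simple features alone, A and B are tied; the conjunctive feature “black-square” is what separates them.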


However, there are two objects in T, 
bringing the total number of features re- 
quired to at least two times the six features 
required for one object. The number of fea- 
tures required increases still further if we 
include feature triplets such as “left-black- 
square.” In general, if there are O objects
in a scene and each object has F features,
then there will be OF simple features. There
will be OF(F − 1)/2 conjunctive features that
combine two simple features (i.e., pairwise
conjunctive features). If we limit ourselves to simple
and pairwise features to explain the pattern
of similarity judgments in Figure 2.3, we still
will require OF(F + 1)/2 features per scene,
or OF(F + 1) features for two scenes that are
compared with one another.
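
The arithmetic can be checked with a few lines. The count of pairwise conjunctive features used here, OF(F − 1)/2, is the number of unordered feature pairs per object times the number of objects; it is also the difference between the stated total, OF(F + 1)/2, and the OF simple features.

```python
def simple_features(O, F):
    return O * F


def pairwise_conjunctions(O, F):
    # F * (F - 1) / 2 unordered pairs of features for each of O objects.
    return O * F * (F - 1) // 2


def features_per_scene(O, F):
    return O * F * (F + 1) // 2


# Two objects with three features each (e.g., location, shape, shading).
O, F = 2, 3
print(simple_features(O, F))         # 6
print(pairwise_conjunctions(O, F))   # 6
print(features_per_scene(O, F))      # 12
print(2 * features_per_scene(O, F))  # 24 for two compared scenes
```

For a single object with three features the total is six, matching the three simple and three complex features counted for the square of T.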

Thus, featural approaches to similarity re- 
quire a fairly large number of features to rep- 
resent scenes that are organized into parts. 
Similar problems exist for dimensional ac- 
counts of similarity. The situation for these 
models becomes much worse when we con- 
sider that similarity is also influenced by re- 
lations between features such as “black to 
the left of white” and “square to the left 
of white.” Considering only binary relations, 
there are O²F²R − OFR relations within a
scene that contains O objects, F features
per object, and R different types of relations
between features. Although more
about these approaches by Hummel and col-
leagues (Holyoak & Hummel, 2000; Hum- 
mel, 2000, 2001; Hummel & Biederman, 
1992; Hummel & Holyoak, 1997, 2003; see 
Doumas & Hummel, Chap. 4), at the very 
least, geometric and featural models appar- 
ently require an implausibly large number of 
attributes to account for the similarity rela- 
tions between structured, multipart scenes. 


Alignment-Based Models 


Partly in response to the difficulties that the 
previous models have in dealing with struc- 
tured descriptions, a number of researchers 
have developed alignment-based models of 
similarity. In these models, comparison is 
not just matching features but determin- 
ing how elements correspond to, or align 
with, one another. Matching features are 
aligned to the extent that they play simi- 
lar roles within their entities. For example, 
a car with a green wheel and a truck with 
a green hood both share the feature green, 
but this matching feature may not increase 
their similarity much because the car’s wheel 
does not correspond to the truck’s hood. 
Drawing inspiration from work on analog- 
ical reasoning (Gentner, 1983; Holyoak & 
Thagard, 1995; see Holyoak, Chap. 6), in 
alignment-based models, matching features 
influence similarity more if they belong to 
parts that are placed in correspondence, and 
parts tend to be placed in correspondence if 
they have many features in common and are 
consistent with other emerging correspon- 
dences (Goldstone, 1994a; Markman & Gen- 
tner, 1993a). Alignment-based models make 
purely relational similarity possible (Falken- 
hainer, Forbus, & Gentner, 1989). 

Initial evidence that similarity involves 
aligning scene descriptions comes from 
Markman and Gentner’s (1993a) result that
when subjects are asked to determine corre- 
sponding objects, they tend to make more 
structurally sound choices when they have 
first judged the similarity of the scenes that 
contain the objects. For example, in Figure 
2.5, subjects could be asked which object in 
the bottom set corresponds to the leftmost 


the similarity of the sets were more likely to 
choose the rightmost object — presumably 
because both objects were the smallest ob- 
jects in their sets. Subjects who did not first 
assess similarity had a tendency to select 
the middle object because its size exactly 
matched the target object’s size. These re- 
sults are predicted if similarity judgments 
naturally entail aligning the elements of 
two scenes. Additional research has found 
that relational choices such as “smallest ob- 
ject in its set” tend to influence similar- 
ity judgments more than absolute attributes 
like “3 inches” when the overall amount 
of relational coherency across sets is high 
(Goldstone, Medin, & Gentner, 1991), the 
scenes are superficially sparse rather than 
rich (Gentner & Rattermann, 1991; Mark- 
man & Gentner, 1993a), subjects are given 
more time to make their judgments (Gold- 
stone & Medin, 1994), the judges are adults 
rather than children (Gentner & Toupin, 
1986), and abstract relations are initially cor- 
related with concrete relations (Kotovsky & 
Gentner, 1996). 

Formal models of alignment-based simi- 
larity have been developed to explain how 
feature matches that belong to well-aligned 
elements matter more for similarity than 
matches between poorly aligned elements 
(Goldstone, 1994a; Love, 2000). Inspired 
by work in analogical reasoning (Holyoak & 
Thagard, 1989), Goldstone’s (1994a) SIAM 
model is a neural network with nodes that 
represent hypotheses that elements across 
two scenes correspond to one another. SIAM 
works by first creating correspondences be- 
tween the features of scenes. Once features 
begin to be placed into correspondence, 
SIAM begins to place objects into corre- 
spondence that are consistent with the fea- 
ture correspondences. Once objects begin to 
be put in correspondence, activation is fed 
back down to the feature (mis)matches that 
are consistent with the object alignments. In 
this way, object correspondences influence 
activation of feature correspondences at the 
same time that feature correspondences influence
the activation of object correspondences.
Activation between nodes spreads
in SIAM by two principles: (1) nodes that
are consistent send excitatory activation to
each other, and (2) nodes that are inconsistent
inhibit one another (see also Holyoak,
Chap. 6). Nodes are inconsistent if they create
two-to-one alignments, that is, if two elements
from one scene would be placed into correspondence
with one element of the other
scene. Node activations affect similarity via
the equation

\[
\text{similarity} = \frac{\sum_{i=1}^{n} (\text{match value}_i \times A_i)}{\sum_{i=1}^{n} A_i} \tag{2.3}
\]

where n is the number of nodes in the system,
A_i is the activation of node i, and the
match value describes the physical similarity
between the two features placed in correspondence
according to the node i.

Figure 2.5. The target from the gray circles
could match either the middle black object
because they are the same size, or the rightmost
object because both objects are the smallest
objects in their sets.

By this equation, the influence of a particular
matching or mismatching feature across
two scenes is modulated by the degree to
which the features have been placed in alignment.
Consistent with SIAM, (1) aligned
feature matches tend to increase similarity
more than unaligned feature matches
(Goldstone, 1994a); (2) the differential influence
between aligned and unaligned feature
matches increases as a function of processing
time (Goldstone & Medin, 1994);
(3) this same differential influence increases
with the clarity of the alignments (Goldstone,
1994a); and (4) under some circumstances,
adding a poorly aligned feature
match can actually decrease similarity by interfering
with the development of proper
alignments (Goldstone, 1996).
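
A minimal sketch of Equation 2.3 follows, reading it as an activation-weighted average of match values (the sum of match value times activation over the n nodes, divided by the summed activation). It covers only this final read-out step; the spreading of activation that sets each A_i is not modeled, and the numbers are hypothetical.

```python
def siam_similarity(match_values, activations):
    """Activation-weighted average of feature match values.

    match_values[i] is the physical similarity of the two features that
    node i places in correspondence; activations[i] is that node's A_i.
    """
    if len(match_values) != len(activations):
        raise ValueError("one match value is needed per node")
    total = sum(activations)
    if total == 0:
        raise ValueError("total activation must be non-zero")
    weighted = sum(m * a for m, a in zip(match_values, activations))
    return weighted / total


# Two nodes: one strongly activated (well aligned), one weakly activated.
activations = [0.8, 0.2]

# A feature match on the well-aligned node counts for more ...
print(siam_similarity([1.0, 0.0], activations))
# ... than the same match on the poorly aligned node.
print(siam_similarity([0.0, 1.0], activations))
```

Because each match value is weighted by its node's activation, the same feature match contributes more when the features it links are well aligned.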

Another empirically validated set of pre- 
dictions stemming from an alignment-based 
approach to similarity concerns alignable 
and nonalignable differences (Markman & 
Gentner, 1993b). Nonalignable differences 
between two entities are attributes of one 
entity that have no corresponding attribute 
in the other entity. Alignable differences 
are differences that require the elements 
of the entities first be placed in correspon- 
dence. When comparing a police car with an 
ambulance, a nonalignable difference is that 
police cars have weapons in them, but am- 
bulances do not. There is no clear equivalent 
of weapons in the ambulance. Alignable dif- 
ferences include the following: police cars 
carry criminals to jails rather than carrying 
sick people to hospitals, a police car is a 
car whereas ambulances are vans, and police 
car drivers are policemen rather than emergency
medical technicians. Consistent with
the role of structural alignment in similarity
comparisons, alignable differences influence
similarity more than nonalignable differences
(Markman & Gentner, 1996) and
are more likely to be encoded in memory
(Markman & Gentner, 1997). Alignable differences
between objects also play a disproportionately
large role in distinguishing between
different basic-level categories (e.g.,
cats and dogs) that belong to the same super- 
ordinate category (e.g., animals) (Markman 
& Wisniewski, 1997). In short, knowing 
these correspondences affects not only how 
much a matching element increases similar- 
ity (Goldstone, 1994a), but also how much 
a mismatching element decreases similarity. 

Thus far, much of the evidence for struc- 
tural alignment in similarity has used some- 
what artificial materials. Often, the systems 
describe how “scenes” are compared, with 
the underlying implication that the elements 
comprising the scenes are not as tightly con- 
nected as elements comprising objects. Still, 
if the structural alignment account proves
ble to naturally occurring materials. Toward
this goal, researchers have considered struc- 
tural accounts of similarity in language do- 
mains. The confusability of words depends 
on structural analyses to predict that “stop” 
is more confusable with “step” than “pest” 
(the “st” match is in the correct location with 
“step” but not “pest”), but more confusable 
with “pest” than “best” (the “p” match counts 
for something even when it is out of place). 
Substantial progress has been made on the
practical problem of determining the struc- 
tural similarity of words (Bernstein, Demor- 
est, & Eberhardt, 1994; Frisch, Broe, & Pier- 
rehumbert, 1995). Structural alignment has 
also been implicated when comparing more 
complex language structures such as sen- 
tences (Bassok & Medin, 1997). Likewise, 
structural similarity has proven to be a use- 
ful notion in explaining consumer prefer- 
ences of commercial products, explaining, 
for example, why new products are viewed 
more favorably when they improve over ex- 
isting products along alignable rather than 
unalignable differences (Zhang & Markman, 
1998). Additional research has shown that 
alignment-based models of similarity pro- 
vide a better account of category-based 
induction than feature-based models (Las- 
saline, 1996). Still other researchers have ap- 
plied structural accounts of similarity to the 
legal domain (Hahn & Chater, 1998; Simon 
& Holyoak, 2002). This area of application 
is promising because the U.S. legal system 
is based on cases and precedents, and cases 
are structurally rich and complex situations 
involving many interrelated parties. Retriev- 
ing a historic precedent and assessing its rel- 
evance to a current case almost certainly in- 
volves aligning representations that are more 
sophisticated than assumed by geometric or 
featural models. 


Transformational Models 


A final historic approach to similarity that 
has been more recently resuscitated is that 
the comparison process proceeds by trans- 
forming one representation into the other. 
A critical step for these models is to spec- 


possible. 

In an early incarnation of a transforma- 
tional approach to cognition broadly con- 
strued, Garner (1974) stressed the notion of 
stimuli that are transformationally equiva- 
lent and are consequently possible alterna- 
tives for each other. In artificial intelligence, 
Shimon Ullman (1996) argued that objects 
are recognized by being aligned with mem- 
orized pictorial descriptions. Once an un- 
known object has been aligned with all can- 
didate models, the best match to the viewed 
object is selected. The alignment operations 
rotate, scale, translate, and topographically 
warp object descriptions. For rigid trans- 
formations, full alignment can be obtained 
by aligning three points on the object with 
three points on the model description. Un- 
like recognition strategies that require struc- 
tural descriptions (e.g., Biederman, 1987; 
Hummel, 2000, 2001), Ullman’s alignment 
does not require an image to be decomposed 
into parts. 

In transformational accounts that are ex- 
plicitly designed to model similarity data, 
similarity is usually defined in terms of trans- 
formational distance. In Wiener-Ehrlich, 
Bart, and Millward’s (1980) generative rep- 
resentation system, subjects are assumed to 
possess an elementary set of transformations 
and invoke these transformations when analyzing
stimuli. Their subjects saw linear pairs
of stimuli such as {ABCD, DABC} or
two-dimensional stimuli. Subjects
were required to rate the similarity of
the pairs. The researchers determined transformations
that accounted for each subject's
ratings from the set {rotate 90 degrees, rotate
180, rotate 270, horizontal reflection,
vertical reflection, positive diagonal reflec- 
tion, negative diagonal reflection}. Similar- 
ity was assumed to decrease monotonically 
as the number of transformations required 
to make one sequence identical to the other 
increased. 

Imai (1977) made a similar claim. 
The stimuli used were sequences such as 
XXOXXXOXXXOX, where Xs represent 
white ovals and Os represent black ovals. 
The four basic transformations were mirror
image, phase shift (XXXXXOO → XXXXOOX),
reversal (XXXXXOO → OOOOOXX),
and wavelength (XXOOXXOO → XOXOXOXO).
The researcher found that sequences
that are two transformations removed
(e.g., XXXOXXXOXXXO and
OOXOOOXOOOXO require a phase shift
and a reversal to be equated) are rated to
be less similar than sequences that can be
made identical with one transformation. In
addition, sequences that can be made identical
by more than one transformation (XOXOXOXO
and OXOXOXOX can be made
identical by mirror image, phase shift, or
reversal transformations) are more similar
than sequences that have only one identity-
producing transformation.
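
Three of the four transformations are easy to sketch on strings of Xs and Os, and doing so reproduces the examples in the text. The definitions are assumptions of this sketch: phase shift is taken to be a cyclic shift of one position to the left, and the wavelength transformation is omitted because its general definition is not given here.

```python
def mirror_image(s):
    """Read the sequence right to left."""
    return s[::-1]


def phase_shift(s):
    """Assumed form: cyclic shift by one position to the left."""
    return s[1:] + s[:1]


def reversal(s):
    """Exchange the two element types."""
    return s.translate(str.maketrans("XO", "OX"))


TRANSFORMS = {
    "mirror image": mirror_image,
    "phase shift": phase_shift,
    "reversal": reversal,
}


def identity_producing(a, b):
    """Names of the single transformations that turn a into b."""
    return [name for name, t in TRANSFORMS.items() if t(a) == b]


print(phase_shift("XXXXXOO"))  # XXXXOOX
print(reversal("XXXXXOO"))     # OOOOOXX

# Three different single transformations equate this pair.
print(identity_producing("XOXOXOXO", "OXOXOXOX"))

# No single transformation equates this pair; a reversal followed by
# a phase shift does.
a, b = "XXXOXXXOXXXO", "OOXOOOXOOOXO"
print(identity_producing(a, b))
print(phase_shift(reversal(a)) == b)
```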

More recent work has followed up on 
Imai’s research and generalized it to stimulus 
materials, including arrangements of Lego 
bricks, geometric complexes, and sets of col- 
ored circles (Hahn, Chater, & Richardson, 
2003). According to these researchers’ ac- 
count, the similarity between two entities 
is a function of the complexity required to 
transform the representation of one into the 
representation of the other. The simpler the 
transformation, the more similar they are assumed
to be. The complexity of a transformation
is determined in accord with Kolmogorov
complexity theory (Li & Vitanyi,
1997), according to which the complexity of
a representation is the length of the shortest
computer program that can generate
that representation. For example, the conditional
Kolmogorov complexity between the
sequences 1 2 3 4 5 6 7 8 and 2 3 4 5 6 7 8 9
is small because the simple instructions
“add 1 to each digit” and “subtract 1 from each
digit” suffice to transform one into the other.
Experiments by Hahn et al. (2003) demon- 
strate that once reasonable vocabularies of 
transformation are postulated, transforma- 
tional complexity does indeed predict sub- 
jective similarity ratings. 

It is useful to compare and con- 
trast alignment-based and transformational 
accounts of similarity. Both approaches 
place scene elements into correspondence. 
Whereas the correspondences are explicitly 


they are implicit in transformational align- 
ment. The transformational account often 
does produce globally consistent correspon- 
dences — for example, correspondences that 
obey the one-to-one mapping principle; 
however, this consistency is a consequence of
applying a patternwide transformation and is 
not enforced by interactions between emerg- 
ing correspondences. It is revealing that 
transformational accounts have been applied 
almost exclusively to perceptual stimuli, 
whereas structural accounts are most often 
applied to conceptual stimuli such as sto- 
ries, proverbs, and scientific theories (there 
are also notable structural accounts in perception,
e.g., Biederman, 1987; Hummel,
2000; Hummel & Biederman, 1992; Marr & 
Nishihara, 1978). Defining a set of con- 
strained transformations is much more ten- 
able for perceptual stimuli. The conceptual 
similarity between an atom and the solar 
system could possibly be discovered by 
transformations. As a start, a miniaturization 
transformation could be applied to the so- 
lar system. However, this single transforma- 
tion is not nearly sufficient; a nucleus is not 
simply a small sun. The transformations that 
would turn the solar system into an atom are 
not readily forthcoming. If we allow transfor- 
mations such as an “earth-becomes-electron” 
transformation, then we are simply reex- 
pressing the structural alignment approach 
and its part-by-part alignment of relations 
and objects. 

Some similarity phenomena that are well 
explained by structural alignment are not 
easily handled by transformations. To account
for the similarity of “BCDCB” and
“ABCDCBA,” we could introduce the fairly
abstract transformation “add the leftmost
letter’s predecessor to both sides of string.”
However, the pair “LMN” and “KLMNK” does
not seem as similar as the earlier pair, even
though the same transformation is applied.
A transformation of the form “if the structure
is symmetric, then add the preceding
element in the series to both ends of the
string” presupposes exactly the kind of analysis
in defining “symmetric” and “preceding”
that is the bread and butter of propositional
For this reason, one fertile research direction
would be to combine alignment-based
accounts’ focus on representing the
internal structure within individual scenes
with the constraints that transformational
accounts provide for establishing psychologically
plausible transformations (Hofstadter,
1997; Mitchell, 1993).


Conclusions and Further Directions 


To provide a partial balance to our largely 
historic focus on similarity, we conclude by 
raising some unanswered questions for the 
field. These questions are rooted in a desire 
to connect the study of similarity to cogni- 
tion as a whole. 


Is Similarity Flexible Enough to Provide 
Useful Explanations of Cognition? 


The study of similarity is typically justi- 
fied by the argument that so many theo- 
ries in cognition depend on similarity as a 
theoretical construct. An account of what 
makes problems, memories, objects, and 
words similar to one another often provides 
the backbone for our theories of problem 
solving, attention, perception, and cogni- 
tion. As William James put it, “This sense 
of Sameness is the very keel and backbone 
of our thinking” (James, 1890/1950, p. 459). 

However, others have argued that simi- 
larity is not flexible enough to provide a suf- 
ficient account, although it may be a nec- 
essary component. There have been many 
empirical demonstrations of apparent disso- 
ciations between similarity and other cog- 
nitive processes, most notably categoriza- 
tion. Researchers have argued that cognition 
is frequently based on theories (Murphy & 
Medin, 1985), rules (Sloman, 1996; Smith & 
Sloman, 1994), or strategies that go beyond 
“mere” similarity. To take an example from 
Murphy and Medin (1985), consider a man 
jumping into a swimming pool fully clothed.
This man may be categorized as drunk be- 
cause we have a theory of behavior and 
inebriation that explains the man’s action. 


gorization of the man’s behavior does not 
depend on matching the man’s features to 
the category drunk’s features. It is highly un- 
likely that the category drunk would have 
such a specific feature as “jumps into pools 
fully clothed.” It is not the similarity be- 
tween the instance and the category that de- 
termines the instance’s classification; it is the 
fact that our category provides a theory that 
explains the behavior. 

Developmental psychologists have ar- 
gued that even young children have inchoate 
theories that allow them to go beyond su- 
perficial similarities in creating categories 
(Carey, 1985; Gelman & Markman, 1986; 
Keil, 1989). For example, Carey (1985) ob- 
served that children choose a toy monkey 
over a worm as being more similar to a human,
but that when they are told that humans
have spleens, they are more likely to infer
that the worm has a spleen than that
the toy monkey does. Thus, the categoriza- 
tion of objects into “spleen” and “no spleen” 
groups does not appear to depend on the 
same knowledge that guides similarity judg- 
ments. Adults show similar dissociations be- 
tween similarity and categorization. In an 
experiment by Rips (1989), an animal that 
is transformed (by toxic waste) from a bird 
into something that looks like an insect is 
judged by subjects to be more similar to an 
insect but is still judged to be a bird. Again, 
the category judgment seems to depend on 
biological, genetic, and historic knowledge, 
whereas the similarity judgment seems to
depend more on gross visual appearance (see 
also Keil, 1989; Rips & Collins, 1993). 

Despite the growing body of evidence 
that similarity appraisals do not always track 
categorization decisions, there are still some 
reasons to be sanguine about the contin- 
ued explanatory relevance of similarity. Cat- 
egorization itself may not be completely 
flexible. People are influenced by similarity 
despite the subjects’ intentions and the ex- 
perimenters’ instructions (Smith & Sloman, 
1994). Allen and Brooks (1991) gave sub- 
jects an easy rule for categorizing cartoon 
animals into two groups. Subjects were then 
transferred to the animals that looked very
longed in a different category. These animals
were categorized more slowly and less accu- 
rately than animals that were equally similar 
to an old animal but also belonged in the 
same category as the old animal. Likewise, 
Palmeri (1997) showed that even for the 
simple task of counting the number of dots, 
subjects’ performances are improved when 
a pattern of dots is similar to a previously 
seen pattern with the same numerosity and 
worse when the pattern is similar to a previ- 
ously seen pattern with different numeros- 
ity. People seem to have difficulty ignoring 
similarities between old and new patterns 
even when they know a straightforward and 
perfectly accurate categorization rule. 
There may be a mandatory considera- 
tion of similarity in many categorization 
judgments (Goldstone, 1994b), adding con- 
straints to categorization. At the same time, 
similarity may be more flexible and sophisti- 
cated than commonly acknowledged (Jones 
& Smith, 1993) and this may also serve to 
bridge the gap between similarity and high- 
level cognition. Krumhansl (1978) argued 
that similarity between objects decreases 
when they are surrounded by many close 
neighbors that were also presented on pre- 
vious trials (also see Wedell, 1994). Tversky 
(1977) obtained evidence for an extension 
effect according to which features influence 
similarity judgments more when they vary 
within an entire set of stimuli. Items pre- 
sented within a particular trial also influence 
similarity judgments. Perhaps the most fa- 
mous example of this is Tversky’s (1977) 
diagnosticity effect according to which features
that are diagnostic for relevant classifications
will have disproportionate influence
on similarity judgments. More recently,
Medin, Goldstone, and Gentner (1993) argued
that different comparison standards are
created, depending on the items that are
present on a particular trial. Other research
has documented intransitivities in similarity
judgments, situations in which A is judged
to be more similar to T than is B, B is more 
similar to T than is C, and C is more similar 
to T than is A (Goldstone, Medin, & Halber- 
stadt, 1997). This kind of result also suggests 


larity of objects are determined, in part, by 
the compared objects themselves. 

Similarity judgments not only depend on 
the context established by recently exposed 
items, simultaneously presented items, and 
inferred contrast sets, but also on the ob- 
server. Suzuki, Ohnishi, and Shigemasu 
(1992) showed that similarity judgments de- 
pend on level of expertise and goals. Expert 
and novice subjects were asked to solve the 
Tower of Hanoi puzzle and judge the sim- 
ilarity between the goal and various states. 
Experts’ similarity ratings were based on 
the number of moves required to trans- 
form one position to the other. Less expert 
subjects tended to base their judgments on 
the number of shared superficial features. 
Similarly, Hardiman, Dufresne, and Mestre 
(1989) found that expert and novice physi- 
cists evaluate the similarity of physics prob- 
lems differently, with experts basing simi- 
larity judgments more on general principles 
of physics than on superficial features (see 
Sjoberg, 1972, for other expert/novice dif- 
ferences in similarity ratings). The depen- 
dency of similarity on observer-, task-, and 
stimulus-defined contexts offers the promise 
that it is indeed flexible enough to subserve 
cognition. 


Is Similarity Too Flexible to Provide 
Useful Explanations of Cognition? 


As a response to the skeptic of similarity’s 
usefulness, the preceding two paragraphs 
could have the exact opposite of their in- 
tended effect. The skeptic might now be- 
lieve that similarity is much too flexible to be 
a stable ground for cognition. In fact, Nelson 
Goodman (1972) put forth exactly this
claim, maintaining that the notion of similar- 
ity is either vague or unnecessary. He argued 
that “when to the statement that two things 
are similar we add a specification of the 
property that they have in common... we 
render it [the similarity statement] superflu- 
ous” (p. 445). That is, all the potential ex- 
planatory work is done by the “with respect 
to property Z” clause and not by the similarity
statement. Instead of saying “this object
belongs to category A because it is similar to
A items with respect to the property ‘red’,”
we can simplify matters by removing any no- 
tion of similarity with “this object belongs to 
category A because it is red.” 

There are reasons to resist Goodman’s 
conclusion that “similarity tends under anal- 
ysis either to vanish entirely or to require 
for its explanation just what it purports to 
explain” (p. 446). In most cases, similarity 
is useful precisely because we cannot flesh 
out the “respect to property Z” clause with 
just a single property. Evidence suggests that 
assessments of overall similarity are natural 
and perhaps even “primitive.” Evidence from 
children’s perception of similarity suggests 
that children are particularly likely to judge 
similarity on the basis of many integrated 
properties rather than analysis into dimen- 
sions. Even dimensions that are perceptu- 
ally separable are treated as fused in sim- 
ilarity judgments (Smith & Kemler, 1978). 
Children younger than 5 years of age tend 
to classify on the basis of overall similar- 
ity and not on the basis of a single criterial 
attribute (Keil, 1989; Smith, 1989). Chil- 
dren often have great difficulty identifying 
the dimension along which two objects vary, 
even though they can easily identify that the 
objects are different in some way (Kemler, 
1983). Smith (1989) argued that it is rel- 
atively difficult for young children to say 
whether two objects are identical on a par- 
ticular property but relatively easy for them 
to say whether they are similar across many 
dimensions. 

There is also evidence that adults of- 
ten have an overall impression of similar- 
ity without analysis into specific properties. 
Ward (1983) found that adult subjects who 
tended to group objects quickly also tended 
to group objects like children by consider- 
ing overall similarity across all dimensions 
instead of maximal similarity on one dimen- 
sion. Likewise, Smith and Kemler (1984) 
found that adults who were given a distract- 
ing task produced more judgments by over- 
all similarity than subjects who were not. To 
the extent that similarity is determined by 
many properties, it is less subject to drastic 
context-driven changes. Furthermore, inte- 


a single assessment of similarity becomes 
particularly important. The four approaches 
to similarity described in the previous sec- 
tion provide methods for integrating multi- 
ple properties into a single similarity judg- 
ment and, as such, go significantly beyond 
simply determining a single “property Z” to 
attend. 
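
The contrast between integrating many properties and attending to a single "property Z" can be made concrete with a small sketch. Everything in it is an assumption chosen for illustration: the three dimensions, the stimulus values, and the use of an exponentially decaying city-block distance are hypothetical and are not taken from the studies cited above.

```python
import math

# Hypothetical stimuli described on three dimensions scaled from 0 to 1.
target = {"size": 0.5, "brightness": 0.5, "tilt": 0.5}
obj_a = {"size": 0.5, "brightness": 0.9, "tilt": 0.9}  # identical on one dimension only
obj_b = {"size": 0.6, "brightness": 0.6, "tilt": 0.6}  # close on every dimension


def overall_similarity(x, y):
    """Integrate all dimensions: exp(-city-block distance)."""
    distance = sum(abs(x[d] - y[d]) for d in x)
    return math.exp(-distance)


def best_single_dimension_similarity(x, y):
    """Attend to one dimension only: the one on which x and y are closest."""
    return max(math.exp(-abs(x[d] - y[d])) for d in x)


for name, obj in [("A", obj_a), ("B", obj_b)]:
    print(name,
          round(overall_similarity(target, obj), 3),
          round(best_single_dimension_similarity(target, obj), 3))
```

Under these assumptions, object B is the closer match by overall similarity, whereas object A wins when only the best-matching dimension is considered, so the two strategies group the same stimuli differently.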

A final point to make about the poten- 
tial overflexibility of similarity is that, al- 
though impressions of similarity can change 
with context and experience, automatic and 
“generic” assessments of similarity typically 
change slowly and with considerable iner- 
tia. Similarities that were once effortful and 
strategic become second nature to the organ- 
ism. Roughly speaking, this is the process of 
perceiving what was once a conceptual similar- 
ity. At first, the novice mycologist explicitly 
uses rules for perceiving the dissimilarity between
the pleasing Agaricus bisporus mushroom
and the deadly Amanita phalloides.
With time, this dissimilarity ceases to be ef- 
fortful and rule based and becomes percep- 
tual and phenomenologically direct. When 
this occurs, the similarity becomes generic 
and default and can be used as the ground 
for new strategic similarities. In this way, our 
cognitive abilities gradually attain sophisti- 
cation by treating territory as level ground 
that once made for difficult mental climbing. 
A corollary of this contention is that our de- 
fault impression of similarity does not typi- 
cally mislead us; it is explicitly designed to 
lead us to see relations between things that 
often function similarly in our world. Peo- 
ple, with good reason, expect their default 
similarity assessments to provide good clues 
about where to uncover directed, nonappar- 
ent similarities (Medin & Ortony, 1989). 


Should “Similarity” Even Be a Field 
of Study Within Cognitive Science? 


This survey has proceeded under the conve- 
nient fiction that it is possible to tell a gen- 
eral story for how people compare things. 
One reason to doubt this is that the meth- 
ods used for assessing similarity have large 
effects on the resulting similarity viewed. 
equivalent to similarity as measured by
perceptual discriminability. Although these 
measures correlate highly, systematic differ- 
ences are found (Podgorny & Garner, 1979; 
Sergent & Takane, 1987). For example, Beck 
(1966) found that an upright T is rated 
as more similar to a tilted T than an up- 
right L but that it is also more likely to be 
perceptually grouped with the upright Ls. 
Previously reviewed experiments indicate 
the nonequivalence of assessments that use 
similarity versus dissimilarity ratings, cat- 
egorization versus forced-choice similarity 
judgments, or speeded versus leisurely judg- 
ments. In everyday discourse we talk about 
the similarity of two things, forgetting that 
this assessment depends on a particular task 
and circumstance. 

Furthermore, it may turn out that the cal- 
culation of similarity is fundamentally differ- 
ent for different domains (see Medin, Lynch, 
& Solomon, 2000, for a thoughtful discus- 
sion of this issue). To know how to calculate 
the similarity of two faces, one would need 
to study faces specifically and the eventual 
account need not inform researchers inter- 
ested in the similarity of words, works of 
music, or trees. A possible conclusion is that 
similarity is not a coherent notion at all. The 
term similarity, similar to the bug or family 
values, may not pick out a consolidated or 
principled set of things. 

Although we sympathize with the im- 
pulse toward domain-specific accounts of 
similarity, we also believe in the value of 
studying general principles of comparison 
that potentially underlie many domains. Al- 
though we do not know whether general 
principles exist, one justification for pursu- 
ing them is the large payoff that would re- 
sult from discovering these principles if they 
do exist. A historically fruitful strategy, ex- 
emplified by Einstein’s search for a law to 
unify gravitational and electromagnetic ac- 
celeration and Darwin’s search for a uni- 
fied law to understand the origins of humans 
and other animals, has been to understand 
differences as parametric variations within 
a single model. Finding differences across 
tasks does not necessarily point to the in- 


spective would use these task differences as 
an illuminating source of information in de- 
veloping a unified account. The systematic 
nature of these task differences should stim- 
ulate accounts that include a formal descrip- 
tion not only of stimulus components but 
also of task components. Future success in 
understanding the task of comparison may 
depend on comparing tasks. 
CHAPTER 3


Concepts and Categories: Memory, 
Meaning, and Metaphysics 


Douglas L. Medin 
Lance J. Rips 


Introduction 


The concept of concepts is difficult to define, 
but no one doubts that concepts are funda- 
mental to mental life and human commu- 
nication. Cognitive scientists generally agree 
that a concept is a mental representation that
picks out a set of entities, or a category. That 
is, concepts refer, and what they refer to are 
categories. It is also commonly assumed that 
category membership is not arbitrary, but 
rather a principled matter. What goes into 
a category belongs there by virtue of some 
lawlike regularities. However, beyond these 
sparse facts, the concept CONCEPT is up 
for grabs. As an example, suppose you have 
the concept TRIANGLE represented as “a 
closed geometric form having three sides.” 
In this case, the concept is a definition, but 
it is unclear what else might be in your trian- 
gle concept. Does it include the fact that ge- 
ometry books discuss them (although some 
don’t) or that their angles sum to 180 degrees (although
in hyperbolic geometry none do)? It
is also unclear how many concepts have def- 
initions or what substitutes for definitions in 
ones that do not. 


Our goal in this chapter is to provide an 
overview of work on concepts and categories 
in the last half-century. There has been such 
a consistent stream of research during this 
period that one reviewer of this literature, 
Gregory Murphy (2002), was compelled to 
call his monograph, The Big Book of Con- 
cepts. Our task is eased by recent reviews, 
including Murphy’s aptly named one (e.g., 
Medin, Lynch, & Solomon, 2000; Murphy, 
2002; Rips, 2001; Wisniewski, 2002). Their 
thoroughness gives us the luxury of writ- 
ing a review focused on a single perspective 
or “flavor” — the relation between concepts, 
memory, and meaning. 

The remainder of this chapter is orga- 
nized as follows. In the rest of this section, 
we briefly describe some of the tasks or 
functions that cognitive scientists have ex- 
pected concepts to perform. This will pro- 
vide a road map to important lines of re- 
search on concepts and categories. Next, we 
return to developments in the late 1960s and 
early 1970s that raised the exciting possi- 
bility that laboratory studies could provide 
deep insights into both concept represen- 
tations and the organization of (semantic) 
memory. We then describe the collapse of this optimism and the ensuing lines
of research that, however intriguing and im- 
portant, essentially ignored questions about 
semantic memory. Next, we trace a number 
of relatively recent developments under the 
somewhat whimsical heading, “Psychometa- 
physics.” This is the view that concepts are 
embedded in (perhaps domain-specific) the- 
ories. This will set the stage for returning 
to the question of whether research on con- 
cepts and categories is relevant to semantics 
and memory organization. We use that ques- 
tion to speculate about future developments 
in the field. In this review, we use all caps to 
refer to concepts and quotation marks to re- 
fer to linguistic expressions. 


Functions of Concepts 


For purposes of this chapter, we collapse the 
many ways people can use concepts into two 
broad functions: categorization and com- 
munication. The conceptual function that 
most research has targeted is categorization, 
the process by which mental representations 
(concepts) determine whether some entity 
is a member of a category. Categorization 
enables a wide variety of subordinate func- 
tions because classifying something as a cat- 
egory member allows people to bring their 
knowledge of the category to bear on the 
new instance. Once people categorize some 
novel entity, for example, they can use rel- 
evant knowledge for understanding and pre- 
diction. Recognizing a cylindrical object as a 
flashlight allows you to understand its parts, 
trace its functions, and predict its behavior. 
For example, you can confidently infer that 
the flashlight will have one or more batter- 
ies, will have some sort of switch, and will 
normally produce a beam of light when the 
switch is pressed. 

Not only do people categorize in order to 
understand new entities, but they also use 
the new entities to modify and update their 
concepts. In other words, categorization sup- 
ports learning. Encountering a member of a 
category with a novel property — for exam- 
ple, a flashlight that has a siren for emer- 


being incorporated into the conceptual rep- 
resentation. In other cases, relations between 
categories may support inference. For exam- 
ple, finding out that flashlights can contain 
sirens may lead you to entertain the idea that 
cell phones and fire extinguishers might also 
contain sirens. Hierarchical conceptual re- 
lations support both inductive and deduc- 
tive reasoning. If all trees contain xylem and 
hawthorns are trees, then one can deduce 
that hawthorns contain xylem. In addition, 
finding out that white oaks contain phloem 
provides some support for the inductive in- 
ference that other kinds of oaks contain 
phloem. People also use categories to in- 
stantiate goals in planning (Barsalou, 1983). 
For example, a person planning to do some 
night fishing might create an ad hoc con- 
cept, THINGS TO BRING ON A NIGHT 
FISHING TRIP, which would include a 
fishing rod, tackle box, mosquito repellent, 
and flashlight. 

Concepts are also centrally involved in 
communication. Many of our concepts corre- 
spond to lexical entries, such as the English 
word “flashlight.” For people to avoid mis- 
understanding each other, they must have 
comparable concepts in mind. If A’s con- 
cept of cell phone corresponds with B’s con- 
cept of flashlight, it will not go well if A 
asks B to make a call. An important part 
of the function of concepts in communica- 
tion is their ability to combine to create an 
unlimited number of new concepts. Nearly 
every sentence you encounter is new — one 
you have never heard or read before — and 
concepts (along with the sentence’s gram- 
mar) must support your ability to under- 
stand it. Concepts are also responsible for 
more ad hoc uses of language. For exam- 
ple, from the base concepts of TROUT and 
FLASHLIGHT, you might create a new con- 
cept, TROUT FLASHLIGHT, which in the 
context of our current discussion would pre- 
sumably be a flashlight used when trying 
to catch trout (and not a flashlight with a 
picture of a trout on it, although this may 
be the correct interpretation in some other 
context). A major research challenge is to 
understand the principles of conceptual combination
and how they relate to communicative
contexts (see Fodor, 1994, 1998;
Gleitman & Papafragou, Chap. 26; Hampton,
1997; Partee, 1995; Rips, 1995; Wisniewski,
1997).


Overview 


So far, we have introduced two roles for con- 
cepts: categorization (broadly construed) 
and communication. These functions and 
associated subfunctions are important to 
bear in mind because studying any one in 
isolation can lead to misleading conclusions 
about conceptual structure (see Solomon, 
Medin, & Lynch, 1999, for a review bear- 
ing on this point). At this juncture, how- 
ever, we need to introduce one more plot 
element into the story we are telling. Pre- 
sumably everything we have been talking 
about has implications for human memory 
and memory organization. After all, con- 
cepts are mental representations, and people 
must store these representations somewhere 
in memory. However, the relation between 
concepts and memory may be more inti- 
mate. A key part of our story is what we 
call “the semantic memory marriage,” the 
idea that memory organization corresponds 
to meaningful relations between concepts. 
Mental pathways that lead from one concept
to another (for example, from ELBOW to
ARM) represent relations like IS A PART
OF that link the same concepts. Moreover,
these memory relations may supply the con- 
cepts with all or part of their meaning. By 
studying how people use concepts in cat- 
egorizing and reasoning, researchers could 
simultaneously explore memory structure 
and the structure of the mental lexicon. In 
other words, the idea was to unify catego- 
rization, communication (in its semantic as- 
pects), and memory organization. As we will 
see, this marriage was somewhat troubled, 
and there are many rumors about its break- 
up. However, we are getting ahead of our 
story. The next section begins with the ini- 
tial romance. 


Research on concepts in the middle of the 
last century reflected a gradual easing away 
from behaviorist and associative learning tra- 
ditions. The focus, however, remained on 
learning. Most of this research was con- 
ducted in laboratories using artificial cate- 
gories (a sample category might be any geo- 
metric figure that is both red and striped) 
and directed at one of two questions: (1) 
Are concepts learned by gradual increases 
in associative strength, or is learning all 
or none (Levine, 1962; Trabasso & Bower, 
1968)?, and (2) Which kinds of rules or 
concepts (e.g., disjunctive, such as RED
OR STRIPED, versus conjunctive, such as 
RED AND STRIPED) are easiest to learn 
(Bourne, 1970; Bruner, Goodnow, & Austin, 
1956; Restle, 1962)? 
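
The two kinds of rules mentioned in question (2) can be written out as predicates over artificial stimuli. This is only a sketch: the alternative values "blue" and "plain" are hypothetical fillers, and the code shows which stimuli each rule admits, not how learners acquire the rule.

```python
from itertools import product

# Hypothetical two-valued stimulus dimensions for an artificial category task.
COLORS = ["red", "blue"]
TEXTURES = ["striped", "plain"]


def red_and_striped(color, texture):
    """Conjunctive rule: both attributes are required."""
    return color == "red" and texture == "striped"


def red_or_striped(color, texture):
    """Disjunctive rule: either attribute suffices."""
    return color == "red" or texture == "striped"


stimuli = list(product(COLORS, TEXTURES))
conjunctive_members = [s for s in stimuli if red_and_striped(*s)]
disjunctive_members = [s for s in stimuli if red_or_striped(*s)]

print(conjunctive_members)  # one of the four stimuli
print(disjunctive_members)  # three of the four stimuli
```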

This early work tended either to ignore 
real world concepts (Bruner et al., 1956, rep- 
resent something of an exception here) or 
to assume implicitly that real world con- 
cepts are structured according to the same 
kinds of arbitrary rules that defined the 
artificial ones. According to this tradition, 
category learning is equivalent to finding 
out the definitions that determine category 
membership. 


Early Theories of Semantic Memory 


Although the work on rule learning set the 
stage for what was to follow, two develop- 
ments associated with the emergence of cog- 
nitive psychology dramatically changed how 
people thought about concepts. 


TURNING POINT 1: MODELS 
OF MEMORY ORGANIZATION 

The idea of programming computers to 
do intelligent things (artificial intelligence 
or AI) had an important influence on the 
development of new approaches to con- 
cepts. Quillian (1967) proposed a hierarchi- 
cal model for storing semantic information 
in a computer that was quickly evaluated 
as a candidate model for the structure of 
human memory (Collins & Quillian, 1969). 
a memory hierarchy that is similar to what
the Quillian model suggests. 

First, note that the network follows a 
principle of cognitive economy. Properties 
true of all animals, such as eating and breath- 
ing, are stored only with the animal con- 
cept. Similarly, properties that are generally 
true of birds are stored at the bird node, 
but properties distinctive to individual kinds 
(e.g., being yellow) are stored with the spe- 
cific concept nodes they characterize (e.g., 
CANARY). A property does not have to 
be true of all subordinate concepts to be 
stored with a superordinate. This is illustrated
in Figure 3.1, where CAN FLY is associated
with the bird node; the few exceptions
(e.g., flightlessness for ostriches) are
stored with particular birds that do not fly. 
Second, note that category membership is 
defined in terms of positions in the hierar- 
chical network. For example, the node for 
CANARY does not directly store the infor- 
mation that canaries are animals; instead, 
membership would be “computed” by mov- 
ing from the canary node up to the bird node 
and then from the bird node to the animal 
node. It is as if a deductive argument is be- 
ing constructed of the form, “All canaries are 
birds and all birds are animals and therefore 
all canaries are animals.” 
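
The two points just described, cognitive economy and membership computed by traversal, can be sketched in a few lines of Python. The dictionary layout, the function names, and the placement of "has feathers" and "has skin" are assumptions made for illustration; the sketch is not Quillian's program and does not reproduce Figure 3.1.

```python
# Each node stores only its own properties plus a link to its superordinate.
NETWORK = {
    "ANIMAL":  {"isa": None,     "props": {"eats": True, "breathes": True, "has skin": True}},
    "BIRD":    {"isa": "ANIMAL", "props": {"can fly": True, "has feathers": True}},
    "CANARY":  {"isa": "BIRD",   "props": {"is yellow": True}},
    "OSTRICH": {"isa": "BIRD",   "props": {"can fly": False}},  # exception stored locally
}


def is_a(concept, category):
    """Compute membership by walking up the isa links."""
    node = concept
    while node is not None:
        if node == category:
            return True
        node = NETWORK[node]["isa"]
    return False


def lookup(concept, prop):
    """Return the first value found for prop while moving up the hierarchy."""
    node = concept
    while node is not None:
        if prop in NETWORK[node]["props"]:
            return NETWORK[node]["props"][prop]
        node = NETWORK[node]["isa"]
    return None  # property not stored anywhere on the path


print(is_a("CANARY", "ANIMAL"))      # membership is derived, not stored
print(lookup("CANARY", "can fly"))   # inherited from BIRD
print(lookup("OSTRICH", "can fly"))  # local exception overrides BIRD
```

Because the search stops at the first node that mentions a property, an exception stored with OSTRICH takes precedence over the default stored with BIRD.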

Although these assumptions about cog- 
nitive economy and traversing a hierarchi- 
cal structure may seem speculative, they 
yield a number of testable predictions. As- 
suming traversal takes time, one would pre- 
dict that the time needed for people to ver- 
ify properties of concepts should increase 
with the network distance between the con- 
cept and the property. For example, people 
should be faster to verify that a canary is yel- 
low than to verify that a canary has feath- 
ers and faster to determine that a canary 
can fly than that a canary has skin. Collins 
and Quillian found general support for these 
predictions. 
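
The ordering that this prediction implies can be derived by counting links. The sketch below assumes, for illustration only, that each property is stored at exactly one node and that verification time grows with the number of links traversed; the table names and the function are hypothetical.

```python
# Hypothetical fragment of the hierarchy: child -> superordinate.
PARENT = {"CANARY": "BIRD", "BIRD": "ANIMAL", "ANIMAL": None}

# Node at which each property is assumed to be stored.
STORED_AT = {
    "is yellow": "CANARY",
    "has feathers": "BIRD",
    "can fly": "BIRD",
    "has skin": "ANIMAL",
}


def links_to_property(concept, prop):
    """Count the isa links traversed from concept to the node storing prop."""
    steps, node = 0, concept
    while node is not None:
        if STORED_AT[prop] == node:
            return steps
        node = PARENT[node]
        steps += 1
    raise ValueError(f"{prop!r} is not reachable from {concept}")


for prop in STORED_AT:
    print(prop, links_to_property("CANARY", prop))
```

Under these assumptions, "is yellow" requires no traversal, "has feathers" and "can fly" require one link, and "has skin" requires two, which matches the predicted ordering of verification times.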


TURNING POINT 2: NATURAL CONCEPTS 
AND FAMILY RESEMBLANCE 


The work on rule learning suggested that 
children (and adults) might learn concepts 


correct definition. In the early 1970s, how- 
ever, Eleanor Rosch and her associates (e.g., 
Rosch, 1973; Rosch & Mervis, 1975) argued 
that most everyday concepts are not orga- 
nized in terms of the sorts of necessary and 
sufficient features that would form a (con- 
junctive) definition for a category. Instead, 
such concepts depend on properties that are 
generally true but need not hold for every 
member. Rosch’s proposal was that concepts 
have a “family resemblance” structure: What 
determines category membership is whether 
an example has enough characteristic prop- 
erties (is enough like other members) to be- 
long to the category. 

One key idea associated with this view 
is that not all category members are equally 
“good” examples of a concept. If member- 
ship is based on characteristic properties and 
some members have more of these proper- 
ties than others, then the ones with more 
characteristic properties should better ex- 
emplify the category. For example, canaries 
but not penguins have the characteristic 
bird properties of flying, singing, and build- 
ing a nest, so one would predict that ca- 
naries would be more typical birds than pen- 
guins. Rosch and Mervis (1975) found that 
people do rate some examples of a cate- 
gory to be more typical than others and 
that these judgments are highly correlated 
with the number of characteristic features 
an example possesses. They also created 
artificial categories conforming to family 
resemblance structures, and produced typ- 
icality effects on learning and on goodness- 
of-example judgments. 
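
The reasoning behind this prediction can be sketched as a simple count of characteristic properties. The three bird properties come from the example above; the remaining properties and the scoring function are hypothetical, and the sketch is not the measure that Rosch and Mervis computed from their participants' feature listings.

```python
# Characteristic (not defining) properties of the category, from the example in the text.
CHARACTERISTIC_BIRD_PROPERTIES = {"flies", "sings", "builds nests"}

# Hypothetical property lists for two category members.
EXAMPLES = {
    "canary":  {"flies", "sings", "builds nests", "is yellow"},
    "penguin": {"swims", "eats fish"},
}


def typicality(example_props, characteristic_props):
    """Family resemblance score: number of characteristic properties the example has."""
    return len(example_props & characteristic_props)


scores = {
    name: typicality(props, CHARACTERISTIC_BIRD_PROPERTIES)
    for name, props in EXAMPLES.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranked)  # most typical example first
```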

Rosch and her associates (Rosch, Mervis, 
Gray, Johnson, & Boyes-Braem, 1976) also 
argued that the family resemblance view 
has important implications for understand- 
ing concept hierarchies. Specifically, they 
suggested that the correlational structure 
of features (instances that share some fea- 
tures tend to share others) creates natu- 
ral “chunks” or clusters of instances that 
correspond to what they referred to as 
basic-level categories. For example, having 
feathers tends to correlate with nesting in 
trees (among other features) in the animal 
kingdom, and having gills with living in
water. The first cluster tends to isolate birds,
whereas the second picks out fish. The
general idea is that these basic-level categories
provide the best compromise between
maximizing within-category similarity
(birds tend to be quite similar to each
other) and minimizing between-category
similarity (birds tend to be dissimilar to
fish). Rosch et al. showed that basic-level
categories are preferred by adults in naming
objects, are learned first by children,
are associated with the fastest categorization
reaction times, and have a number of
other properties that indicate their special
conceptual status.

Figure 3.1. A semantic network.
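
The compromise between within-category and between-category similarity can be illustrated numerically. In the sketch below, the four instances, their feature lists, and the use of Jaccard overlap as the similarity measure are all assumptions chosen for illustration; Rosch et al. did not use this computation.

```python
from itertools import combinations

# Hypothetical instances with correlated features (feathers go with nesting, gills with water).
INSTANCES = {
    "robin":   {"feathers", "nests in trees", "wings", "eats", "red breast"},
    "sparrow": {"feathers", "nests in trees", "wings", "eats", "brown"},
    "trout":   {"gills", "lives in water", "fins", "eats", "spotted"},
    "salmon":  {"gills", "lives in water", "fins", "eats", "pink flesh"},
}
BASIC_LEVEL = {"robin": "bird", "sparrow": "bird", "trout": "fish", "salmon": "fish"}


def jaccard(a, b):
    """Shared features divided by all features of the pair."""
    return len(a & b) / len(a | b)


def mean(values):
    return sum(values) / len(values)


within, between = [], []
for x, y in combinations(INSTANCES, 2):
    sim = jaccard(INSTANCES[x], INSTANCES[y])
    (within if BASIC_LEVEL[x] == BASIC_LEVEL[y] else between).append(sim)

within_similarity = mean(within)
between_similarity = mean(between)
print(round(within_similarity, 3), round(between_similarity, 3))
```

With these hypothetical features, pairs inside BIRD or FISH share most of their features, whereas pairs that cross the two categories share only the feature common to all animals.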

Turning points 1 and 2 are not unrelated. 
To be sure, the Collins and Quillian model, 
as initially presented, would not predict typ- 
icality effects (but see Collins & Loftus, 
1975), and it was not obvious that it con- 
tained anything that would predict the im- 
portance of basic-level categories. Nonethe- 
less, these conceptual breakthroughs led to 
an enormous amount of research premised 
on the notion that memory groups concepts 
according to their similarity in meaning,
where similarity is imposed by correlated 
and taxonomic structure (see Anderson & 
Bower, 1973, and Norman & Rumelhart, 
1975, for theories and research in this tra- 
dition, and Goldstone & Son, Chap. 2, for 
current theories of similarity). 


Fragmentation of Semantics and Memory 


Prior to about 1980, most researchers in 
this field saw themselves as investigating “se- 
mantic memory” — the way that long-term 
memory organizes meaningful information. 
Around 1980, the term itself became passé, 
at least for this same group of researchers, 
and the field regrouped under the banner 
of “Categories and Concepts” (the title of 
Smith & Medin’s, 1981, synthesis of research 
in this area). At the time, these researchers 
may well have seen this change as a purely 
nominal one, but we suspect it reflected a 
retreat from the claim that semantic mem- 
ory research had much to say about either 
semantics or memory. How did this change 
come about? 




MEMORY ORGANIZATION


Initial support for a Quillian-type mem- 
ory organization came from Quillian’s own 
collaboration with Allan Collins (Collins & 
Quillian, 1969), which we mentioned ear- 
lier. Related evidence also came from ex- 
periments on lexical priming: Retrieving the 
meaning of a word made it easier to retrieve 
the meaning of semantically related words 
(e.g., Meyer & Schvaneveldt, 1971). In these
lexical decision tasks, participants viewed 
a single string of letters on each trial and 
decided, under reaction time instructions, 
whether the string was a word (“daisy”) or 
a nonword (“raisy”). The key result was that 
participants were faster to identify a string 
as a word if it followed a semantically re- 
lated item rather than an unrelated one. For 
example, reaction time for “daisy” was faster 
if, on the preceding trial, the participant had 
seen “tulip” rather than “steel.” This priming 
effect is consistent with the hypothesis that 
activation from one concept spreads through 
memory to semantically related ones. 
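
The spreading-activation account of this priming effect can be sketched as follows. The link structure, the decay value, and the step limit are hypothetical, and the sketch assumes that more residual activation on a target corresponds to a faster lexical decision; it is not the model tested in the studies cited above.

```python
# Hypothetical links between concepts; "tulip" and "daisy" share the node "flower".
LINKS = {
    "tulip": ["flower"],
    "daisy": ["flower"],
    "flower": ["tulip", "daisy"],
    "steel": ["metal"],
    "metal": ["steel"],
}


def spread(source, decay=0.5, max_steps=3):
    """Breadth-first spread: each link traversed multiplies activation by decay."""
    activation = {source: 1.0}
    frontier = [source]
    for _ in range(max_steps):
        next_frontier = []
        for node in frontier:
            for neighbor in LINKS.get(node, []):
                if neighbor not in activation:
                    activation[neighbor] = activation[node] * decay
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return activation


def residual_activation(prime, target):
    """Activation left on the target after the prime has been processed."""
    return spread(prime).get(target, 0.0)


print(residual_activation("tulip", "daisy"))  # related prime reaches the target
print(residual_activation("steel", "daisy"))  # unrelated prime does not
```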

Later findings suggested, however, that 
the relation between word meaning and 
memory organization was less straightfor- 
ward. For example, the typicality findings 
(see turning point 2) suggested that time to 
verify sentences of the form An X is a Y 
(e.g., “A finch is a bird”) might be a function
of the overlap in the information that
participants knew about the meaning of X 
and Y rather than the length of the pathway 
between these concepts. The greater the in- 
formation overlap — for example, the greater 
the number of properties that the referents 
of X and Y shared — the faster the time to 
confirm a true sentence and the slower the 
time to disconfirm a false one. For exam- 
ple, if you know a lot of common informa- 
tion about finches and birds but only a lit- 
tle common information about ostriches and 
birds, you should be faster to confirm the 
sentence “A finch is a bird” than “An ostrich 
is a bird.” Investigators proposed several the- 
ories along these lines that made minimal 
commitments to the way memory organized 
its mental concepts (McCloskey & Glucks- 
berg, 1979; Smith, Shoben, & Rips, 1974; 
Tversky, 1977). Rosch’s (1978) theory likewise
studiously avoided a stand on memory
structure.
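
The overlap idea makes only an ordinal prediction, which can be sketched without any commitment to memory structure. The property lists, the extra items "bat" and "chair," and the difficulty score are hypothetical; the sketch is not any of the specific theories cited above.

```python
# Hypothetical information a participant knows about each term.
KNOWN = {
    "bird":    {"has feathers", "flies", "sings", "lays eggs"},
    "finch":   {"has feathers", "flies", "sings", "lays eggs", "small"},
    "ostrich": {"has feathers", "lays eggs", "runs", "very large"},
    "bat":     {"flies", "has fur", "small"},
    "chair":   {"has legs", "made of wood"},
}


def overlap(x, y):
    """Number of properties shared by what is known about x and y."""
    return len(KNOWN[x] & KNOWN[y])


def predicted_difficulty(x, y, sentence_is_true):
    """Ordinal score (higher = slower). Overlap helps true sentences and hurts false ones."""
    shared = overlap(x, y)
    return -shared if sentence_is_true else shared


print(predicted_difficulty("finch", "bird", True),
      predicted_difficulty("ostrich", "bird", True))
print(predicted_difficulty("bat", "bird", False),
      predicted_difficulty("chair", "bird", False))
```

Under these assumptions, "A finch is a bird" is confirmed faster than "An ostrich is a bird," and "A bat is a bird" is rejected more slowly than "A chair is a bird."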

Evidence from priming in lexical decision 
tasks also appeared ambiguous. Although 
priming occurs between associatively related 
words (e.g., “bread” and “butter”), it is not 
so clear that there is priming between se- 
mantically linked words in the absence of 
such associations. It is controversial whether, 
for example, there is any automatic activa- 
tion between “glove” and “hat” despite their 
joint membership in the clothing category 
(see Balota, 1994, for a discussion). If mem- 
ory is organized on a specifically semantic 
basis — on the basis of word meanings — then 
there should be activation between seman- 
tically related words even in the absence of 
other sorts of associations. A meta-analysis 
by Lucas (2000) turned up a small effect 
of this type, but as Lucas noted, it is diffi- 
cult to tell whether the semantically related 
pairs in these experiments are truly free of 
associations. 

The idea that memory organization mim- 
ics semantic organization is an attractive one, 
and memory researchers attempted to mod- 
ify the original Quillian approach to bring 
it into line with the results we have just re- 
viewed (e.g., Collins & Loftus, 1975). The 
data from the sentence verification and lex- 
ical decision experiments, however, raised 
doubts about these theories. Later in this 
chapter, we consider whether newer tech- 
niques can give us a better handle on the 
structure of memory, but for now let’s turn 
to the other half of the memory equals 
meaning equation. 


SEMANTICS 


Specifying the meaning of individual words 
is one of the goals of semantics, but only one. 
Semantics must also account for the mean- 
ing of phrases, sentences, and longer units 
of language. One problem in using a theory 
like Quillian’s as a semantic theory is how to 
extend its core idea — that the meaning of a 
word is the coordinates of a node in mem- 
ory structure — to explain how people under- 
stand meaningful phrases and sentences. Of 
course, Quillian’s theory and its successors 
correspond to preexisting memory pathways.
We have already seen how the model
can explain our ability to confirm sentences 
such as “A daisy is a flower.” However, what 
about sentences that do not correspond to 
preexisting connections — sentences such as 
“Fred placed a daisy in a lunchbox”? 

The standard approach to sentence mean- 
ing in linguistics is to think of the mean- 
ing of sentences as built from the meaning 
of the words that compose them, guided 
by the sentence’s grammar (e.g., Chierchia 
& McConnell-Ginet, 1990). We can under- 
stand sentences that we have never heard or 
read before, and because there are an enor- 
mous number of such novel sentences, we 
cannot learn their meaning as single chunks. 
It therefore seems quite likely that we com- 
pute the meaning of these new sentences. 
However, if word meaning is the position of 
a node in a network, it is hard to see how this 
position could combine with other positions 
to produce sentence meanings. What is the 
process that could take the relative network 
positions for FRED, PLACE, DAISY, IN, and 
LUNCHBOX and turn them into a meaning
for “Fred placed a daisy in a lunchbox”? 

If you like the notion of word meaning 
as relative position, then one possible solu- 
tion to the problem of sentence meaning 
is to connect these positions with further 
pathways. Because we already have an ar- 
ray of memory nodes and pathways at our 
disposal, why not add a few more to encode
the meaning of a new sentence? Perhaps
the meaning of “Fred placed a daisy in
a lunchbox” is given by a new set of pathways
that interconnect the nodes for FRED,
PLACE, DAISY, and so on, in a configura- 
tion corresponding to the sentence’s struc- 
ture. This is the route that Quillian and his 
successors took (e.g., Anderson & Bower, 
1973; Norman & Rumelhart, 1975; Quil- 
lian, 1969), but it comes at a high price. 
Adding new connections changes the over- 
all network configuration and thereby al- 
ters the meaning of the constituent terms. 
(Remember: Meaning is supposed to be rel- 
ative position.) However, it is far from ob- 
vious that encoding incidental facts alters 


ple, that learning the sentence about Fred 
changes the meaning of “daisy.” Moreover, 
because meaning is a function of the en- 
tire network, the same incidental sentences 
change the meaning of all words. Learning 
about Fred’s daisy placing shifts the meaning 
of seemingly unrelated words such as “hip- 
popotamus” if only a bit. 

Related questions apply to other psycho- 
logical theories of meaning in the semantic 
memory tradition. To handle the typicality 
results mentioned earlier, some investigators 
proposed that the mental representation of 
a category such as daisies consists of a pro- 
totype for that category — for example, a 
description of a good example of a daisy 
(e.g., Hampton, 1979; McCloskey & Glucks- 
berg, 1979). The meaning of “daisy” in these 
prototype theories would thus include de- 
fault characteristics, such as growing in gar- 
dens, that apply to most, but not all, daisies. 
We discuss prototype theories in more de- 
tail soon, but the point for now is that pro- 
totype representations for individual words 
are difficult to combine to obtain a mean- 
ing for phrases that contain them. One po- 
tential way to combine prototypes — fuzzy 
set theory (Zadeh, 1965) — proved vulner- 
able to a range of counterexamples (Osh- 
erson & Smith, 1981, 1982). In general, the 
prototypes of constituent concepts can differ 
from the prototypes of their combinations in 
unpredictable ways (Fodor, 1994). The prototype
of BIRDS THAT ARE PETS (perhaps
a parakeet-like bird) may differ from
the prototypes of both BIRDS and PETS 
(see Storms, de Boeck, van Mechelen, & 
Ruts, 1998, for related evidence). Thus, if 
word meanings are prototypes, it is hard to 
see how the meaning of phrases could be 
a compositional function of the meaning of 
their parts. 
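
The difficulty with the fuzzy-set proposal can be made concrete with a small sketch. Under Zadeh's (1965) intersection rule, membership in a conjunction is the minimum of the memberships in the constituents, so an instance can never be more typical of the combination than of either part. The typicality values below are invented for illustration and are not data from the studies cited.

```python
def fuzzy_conjunction(membership_a: float, membership_b: float) -> float:
    """Zadeh's (1965) intersection rule: membership in 'A and B' is the minimum."""
    return min(membership_a, membership_b)

# Hypothetical typicality ratings on a 0-1 scale (illustrative assumptions).
parakeet = {"bird": 0.6, "pet": 0.7, "pet bird (rated directly)": 0.95}

predicted = fuzzy_conjunction(parakeet["bird"], parakeet["pet"])
observed = parakeet["pet bird (rated directly)"]

# The min rule caps the prediction at the lower constituent value, so it cannot
# reproduce an instance that is more typical of the combination than of its parts.
rule_is_violated = observed > predicted
print(predicted, observed, rule_is_violated)
```

Any instance that is rated higher in the combination than in one of its constituents is a counterexample of this kind.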

Other early theories proposed that cate- 
gory representations consist of descriptions 
of exemplars of the category in question. 
For example, the mental representation of 
DAISY would include descriptions of spe- 
cific daisies that an individual had encoded 
(e.g., Hintzman, 1986; Medin & Schaffer,
1978; Nosofsky, 1986). However, these
own (see Rips, 1995). For example, if by
chance the only Nebraskans you have met 
are chiropractors and the only chiroprac- 
tors you have met are Nebraskans, then ex- 
emplar models appear to mispredict that 
“Nebraskan” and “chiropractor” will be syn- 
onyms for you. 

To recap briefly, we have found that ex- 
perimental research on concepts and cate- 
gories was largely unable to confirm that 
global memory organization (as in Quillian’s 
semantic memory) conferred word meaning. 
In addition, neither the global theories that 
initiated this research nor the local proto- 
type or exemplar theories that this research 
produced were able to provide insight into 
the basic semantic problem of how we un- 
derstand the meaning of novel sentences. 
This left semantic memory theory in the un- 
enviable position of being unable to explain 
either semantics or memory. 


Functions and Findings 


Current research in this field still focuses 
on categorization and communication, but 
without the benefit of a framework that 
gives a unified explanation for the functions 
that concepts play in categorizing, reasoning, 
learning, language understanding, and mem- 
ory organization. In this section, we survey 
the state of the art, and in the following one, 
we consider the possibility of reuniting some 
of these roles. 


Category Learning and Inference 


One nice aspect of Rosch and Mervis’s 
(1975) studies of typicality effects is that 
they used both natural language categories 
and artificially created categories. Finding 
typicality effects with natural (real world) 
categories shows that the phenomenon is 
of broad interest; finding these same effects 
with artificial categories provides systematic 
control for potentially confounding variables 
(e.g., exemplar frequency) in a way that can- 
not be done for lexical concepts. This general 
strategy linking the natural to the artificial 


decades. Although researchers using artifi- 
cial categories have sometimes been guilty 
of treating these categories as ends in them- 
selves, there are enough parallels between 
results with artificial and natural categories 
that each area of research informs the other 
(see Medin & Coley, 1998, for a review). 


PROTOTYPE VERSUS EXEMPLAR MODELS 


One idea compatible with Rosch’s family re- 
semblance hypothesis is the prototype view. It 
proposes that people learn the characteristic 
features (or central tendency) of categories 
and use them to represent the category 
(e.g., Reed, 1972). This abstract prototype 
need not correspond to any experienced 
example. According to this theory, catego- 
rization depends on similarity to the pro- 
totypes. For example, to decide whether 
some animal is a bird or a mammal, a per- 
son would compare the (representation of) 
that animal to both the bird and the mam- 
mal prototypes and assign it to the cate- 
gory whose prototype it most resembled. 
The prototype view accounts for typicality 
effects in a straightforward manner. Good 
examples have many characteristic proper- 
ties of their category and have few charac- 
teristics in common with the prototypes of 
contrasting categories. 
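
A minimal sketch of the prototype view follows. It assumes that stimuli are coded as lists of numeric feature values, that the prototype is the feature-wise mean of the training examples, and that similarity is inverse Euclidean distance. None of these choices is specified in the text; the feature names and values are hypothetical.

```python
import math

def prototype(examples):
    """Central tendency: the feature-wise mean of the training examples."""
    n = len(examples)
    return [sum(column) / n for column in zip(*examples)]

def classify_by_prototype(item, prototypes):
    """Assign the item to the category whose prototype is nearest."""
    return min(prototypes, key=lambda name: math.dist(item, prototypes[name]))

# Hypothetical two-feature stimuli (say, wing size and amount of fur).
training = {
    "bird":   [[0.9, 0.1], [0.8, 0.0], [1.0, 0.2]],
    "mammal": [[0.1, 0.9], [0.0, 0.8], [0.2, 1.0]],
}
prototypes = {name: prototype(examples) for name, examples in training.items()}

new_animal = [0.7, 0.3]
label = classify_by_prototype(new_animal, prototypes)
print(label)
```

On this sketch the prototype need not match any training example, and items close to their own prototype and far from the contrasting one are the "good examples."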

Early research appeared to provide strik- 
ing confirmation of the idea of prototype ab- 
straction. Using random dot patterns as the 
prototypes, Posner and Keele (1968, 1970) 
produced a category from each prototype. 
The instances in a category were “distor- 
tions” of the prototype generated by mov- 
ing constituent dots varying distances from 
their original positions. Posner and Keele first 
trained participants to classify examples that 
they created by distorting the prototypes. 
Then they gave a transfer test in which they 
presented both the old patterns and new low 
or high distortions that had not appeared 
during training. In addition, the prototypes, 
which the participants had never seen, were 
presented during transfer. Participants had 
to categorize these transfer patterns, but 
unlike the training procedure, the transfer 
test gave participants no feedback about the 
ther immediately followed training or appeared
after a 1-week delay.

Posner and Keele (1970) found that cor- 
rect classification of the new patterns de- 
creased as distortion (distance from a cat- 
egory prototype) increased. This is the 
standard typicality effect. The most striking 
result was that a delay differentially affected 
categorization of prototypic versus old train- 
ing patterns. Specifically, correct categoriza- 
tion of old patterns decreased over time to a 
reliably greater extent than performance on 
prototypes. In the immediate test, partici- 
pants classified old patterns more accurately 
than prototypes; however, in the delayed 
test, accuracy on old patterns and proto- 
types was about the same. This differential 
forgetting is compatible with the idea that 
training leaves participants with representations
of both training examples and abstracted
prototypes but that memory for
examples fades more rapidly than memory
for prototypes. The Posner and Keele results
were quickly replicated by others and con- 
stituted fairly compelling evidence for the 
prototype view. 

However, this proved to be the begin- 
ning of the story rather than the end. Other 
researchers (e.g., Brooks, 1978; Medin & 
Schaffer, 1978) put forth an exemplar view of 
categorization. Their idea was that memory 
for old exemplars by itself could account for 
transfer patterns without the need for posit- 
ing memory for prototypes. On this view, 
new examples are classified by assessing their 
similarity to stored examples and assigning 
the new example to the category that has the 
most similar examples. For instance, some 
unfamiliar bird (e.g., a heron) might be correctly
categorized as a bird not because it is
similar to a bird prototype, but rather be- 
cause it is similar to flamingos, storks, and 
other shore birds. 
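
The exemplar view can be sketched in the same style. The code below assumes that similarity falls off exponentially with Euclidean distance and that evidence for a category is the summed similarity to its stored exemplars. Both are simplifying assumptions rather than the specific equations of Brooks (1978) or Medin and Schaffer (1978), and the stimulus dimensions and values are hypothetical.

```python
import math

def similarity(item, exemplar, sensitivity=2.0):
    """Similarity decays exponentially with distance (an assumed form)."""
    return math.exp(-sensitivity * math.dist(item, exemplar))

def category_evidence(item, exemplars_by_category):
    """Summed similarity of the item to every stored exemplar of each category."""
    return {
        name: sum(similarity(item, exemplar) for exemplar in exemplars)
        for name, exemplars in exemplars_by_category.items()
    }

def classification_probabilities(item, exemplars_by_category):
    """Relative evidence for each category (a simple ratio rule)."""
    evidence = category_evidence(item, exemplars_by_category)
    total = sum(evidence.values())
    return {name: value / total for name, value in evidence.items()}

# Hypothetical stored exemplars on two made-up dimensions (leg length, body mass).
memory = {
    "bird":   [[0.9, 0.3], [0.8, 0.4], [0.2, 0.1]],  # flamingo, stork, sparrow
    "mammal": [[0.3, 0.9], [0.4, 0.8]],
}
heron = [0.85, 0.35]
probabilities = classification_probabilities(heron, memory)
print(probabilities)
```

No prototype is computed anywhere: the heron is classified as a bird because it lies close to the stored flamingo and stork, even though it is far from the sparrow.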

In general, similarity to prototypes and 
similarity to stored examples will tend to 
be highly correlated (Estes, 1986). Nonethe- 
less, for some category structures and for 
some specific exemplar and prototype mod- 
els, it is possible to develop differential pre- 
dictions. Medin and Schaffer (1978), for ex- 


against high similarity to particular train- 
ing examples and found that categorization 
was more strongly influenced by the latter. 
A prototype model would make the oppo- 
site prediction. 

Another contrast between exemplar and 
prototype models revolves around sensitiv- 
ity to within-category correlations (Medin, 
Altom, Edelson, & Freko, 1982). A proto- 
type representation captures what is on av- 
erage true of a category, but is insensitive 
to within-category feature distributions. For 
example, a bird prototype could not repre- 
sent the impression that small birds are more 
likely to sing than large birds (unless one 
had separate prototypes for large and small 
birds). Medin et al. (1982) found that people 
are sensitive to within-category correlations 
(see also Malt & Smith, 1984, for corre- 
sponding results with natural object cate- 
gories). Exemplar theorists were also able 
to show that exemplar models could readily 
predict other effects that originally appeared 
to support prototype theories — differen- 
tial forgetting of prototypes versus train- 
ing examples, and prototypes being catego- 
rized as accurately or more accurately than 
training examples. In short, early skirmishes 
strongly favored exemplar models over pro- 
totype models. Parsimony suggested no need 
to posit prototypes if stored instances could 
do the job. Since the early 1980s, there have 
been a number of trends and developments 
in research and theory with artificially con- 
structed categories, and we give only the 
briefest of summaries here. 
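
The contrast concerning within-category correlations can be illustrated with a toy data set. The exemplars and their coding are invented for illustration. The sketch shows only that a feature-wise average discards the size-song relation, whereas the stored instances retain it.

```python
# Hypothetical bird exemplars coded as (is_small, sings); values are invented.
birds = [(1, 1), (1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 0), (0, 1)]

# A prototype keeps only the per-feature averages.
n = len(birds)
prototype = (sum(b[0] for b in birds) / n, sum(b[1] for b in birds) / n)

def p_sings_given_size(exemplars, is_small):
    """Estimate P(sings | size) from the stored exemplars that match on size."""
    matching = [sings for small, sings in exemplars if small == is_small]
    return sum(matching) / len(matching)

p_small = p_sings_given_size(birds, 1)  # computed from the four small birds
p_large = p_sings_given_size(birds, 0)  # computed from the four large birds
print(prototype, p_small, p_large)
```

The prototype records that half the birds are small and half sing, but nothing in it says that singing goes with being small. The exemplar store supports that inference directly.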


NEW MODELS 


There are now more contending models for 
categorizing artificial stimuli, and the early 
models have been extensively elaborated. 
For example, researchers have generalized 
the original Medin and Schaffer (1978) ex- 
emplar model to handle continuous dimen- 
sions (Nosofsky, 1986), to address the time 
course of categorization (Lamberts, 1995; 
Nosofsky & Palmeri, 1997a; Palmeri, 1997), 
to generate probability estimates in infer- 
ence tasks (Juslin & Persson, 2002), and 
chke, 1992).

Three new kinds of classification theories 
have been added to the discussion: ration- 
al approaches, decision-bound models, and 
neural network models. Anderson (1990, 
1991) proposed that an effective approach 
to modeling cognition in general and catego- 
rization in particular is to analyze the infor- 
mation available to a person in the situation 
of interest and then to determine abstractly 
what an efficient, if not optimal, strategy 
might be. This approach has led to some new 
sorts of experimental evidence (e.g., Anderson
& Fincham, 1996; Clapper & Bower,
2002) and pointed researchers more in the 
direction of the inference function of cate- 
gories. Interestingly, the Medin and Schaf- 
fer exemplar model corresponds to a spe- 
cial case of the rational model, and Nosofsky 
(1991) discussed the issue of whether the 
rational model adds significant explanatory 
power. However, there is also some evidence 
undermining the rational model’s predic- 
tions concerning inference (e.g., Malt, Ross, 
& Murphy, 1995; Murphy & Ross, 1994; 
Palmeri, 1999; Ross & Murphy, 1996). 

Decision-bound models (e.g., Ashby & 
Maddox, 1993; Maddox & Ashby, 1993) 
draw their inspiration from psychophysics 
and signal detection theory. Their primary 
claim is that category learning consists of 
developing decision bounds around the cat- 
egory that will allow people to categorize 
examples successfully. The closer an item is 
to the decision bound the harder it should 
be to categorize. This framework offers a 
new perspective on categorization in that it 
may lead investigators to ask questions such 
as How do the decision bounds that hu- 
mans adopt compare with what is optimal? 
and What kinds of decision functions are 
easy or hard to acquire? Researchers have 
also directed efforts to distinguish decision- 
bound and exemplar models (e.g., Maddox, 
1999; Maddox & Ashby, 1998; McKinley & 
Nosofsky, 1995; Nosofsky, 1998; Nosofsky 
& Palmeri, 1997b). One possible difficulty
with decision-bound models is that they 
contain no obvious mechanism by which 
stimulus familiarity can affect performance, 


(Verguts, Storms, & Tuerlinckx, 2001). 
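
The core claim of decision-bound models can be sketched with a linear bound. The code assumes a two-dimensional stimulus space and a straight-line bound; the weights, bias, and stimulus values are hypothetical, and actual decision-bound models also consider other bound shapes and perceptual noise.

```python
import math

def signed_distance(item, weights, bias):
    """Signed distance from the item to the linear bound weights . x + bias = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * x for w, x in zip(weights, item)) + bias) / norm

def categorize(item, weights, bias):
    """Category A on the positive side of the bound, category B otherwise."""
    return "A" if signed_distance(item, weights, bias) > 0 else "B"

# Hypothetical bound x1 = x2, that is, weights (1, -1) and bias 0.
weights, bias = (1.0, -1.0), 0.0
near = (0.55, 0.45)  # close to the bound: predicted to be hard to categorize
far = (0.90, 0.10)   # far from the bound: predicted to be easy to categorize

for item in (near, far):
    print(item, categorize(item, weights, bias),
          abs(signed_distance(item, weights, bias)))
```

Both items fall in category A, but the absolute distance to the bound orders them by predicted difficulty.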
Neural network or connectionist models 
are the third type of new model on the 
scene (see Knapp & Anderson, 1984, and 
Kruschke, 1992, for examples, and Doumas 
& Hummel, Chap. 4, for further discussion 
of connectionism). It may be a mistake to 
think of connectionist models as compris- 
ing a single category because they take many 
forms, depending on assumptions about hid- 
den units, attentional processes, recurrence, 
and the like. There is one sense in which 
neural network models with hidden units 
may represent a clear advance on proto- 
type models: They can form prototypes in 
a bottom-up manner that reflects within-category
structure (e.g., Love, Medin, &
Gureckis, 2004). That is, if a category com- 
prises two distinct clusters of examples, net- 
work models can create a separate hidden 
unit for each chunk (e.g., large birds versus
small birds) and thereby show sensitivity to 
within-category correlations. 


MIXED MODELS AND MULTIPLE 
CATEGORIZATION SYSTEMS 

A common response to hearing about var- 
ious models of categorization is to suggest 
that all the models may be capturing im- 
portant aspects of categorization and that 
research should determine in which con- 
texts one strategy versus another is likely 
to dominate. One challenge to this divide 
and conquer program is that the predic- 
tions of alternative models tend to be highly 
correlated, and separating them is far from 
trivial. Nonetheless, there is both empiri- 
cal research (e.g., Johansen & Palmeri, 2002; 
Nosofsky, Clark, & Shin, 1989; Regehr &
Brooks, 1993) and theoretical modeling that 
support the idea that mixed models of cat- 
egorization are useful and perhaps neces- 
sary. Current efforts combine rules and ex- 
amples (e.g., Erickson & Kruschke, 1998; 
Nosofsky, Palmeri, & McKinley, 1994), as 
well as rules and decision bounds (Ashby, 
Alfonso-Reese, Turken, & Waldron, 1998). 
Some models also combine exemplars and 
prototypes (e.g., Homa, Sterling, & Trepel, 
1981; Minda & Smith, 2001; Smith & Minda, 
but it remains controversial whether the addition
of prototypes is needed (e.g., Busemeyer,
Dewey, & Medin, 1984; Nosofsky
& Johansen, 2000; Nosofsky & Zaki, 2002; 
Stanton, Nosofsky, & Zaki, 2002). 

The upsurge of cognitive neuroscience 
has reinforced the interest in multiple mem- 
ory systems. One intriguing line of research 
by Knowlton, Squire, and associates (Knowl- 
ton, Mangels, & Squire, 1996; Knowlton & 
Squire, 1993; Squire & Knowlton, 1995) fa- 
voring multiple categorization systems in- 
volves a dissociation between categoriza- 
tion and recognition. Knowlton and Squire 
(1993) used the Posner and Keele dot pattern 
stimuli to test amnesic and matched con- 
trol patients on either categorization learn- 
ing and transfer or a new-old recognition 
task (involving five previously studied pat- 
terns versus five new patterns). The amne- 
siacs performed very poorly on the recog- 
nition task but were not reliably different 
from control participants on the categoriza- 
tion task. Knowlton and Squire took this as 
evidence for a two-system model, one based 
on explicit memory for examples and one 
based on an implicit system (possibly pro- 
totype abstraction). On this view, amnesiacs 
have lost access to the explicit system but 
can perform the classification task using their 
intact implicit memory. 

These claims have provoked a number of 
counterarguments. First, Nosofsky and Zaki 
(1998) showed that a single system (exem- 
plar) model could account for both types 
of data from both groups (by assuming the 
exemplar-based memory of amnesiacs was 
impaired but not absent). Second, investi- 
gators have raised questions about the de- 
tails of Knowlton and Squire’s procedures. 
Specifically, Palmeri and Flanery (1999) sug- 
gested that the transfer tests themselves 
may have provided cues concerning cate- 
gory membership. They showed that un- 
dergraduates who had never been exposed 
to training examples (the students believed 
they were being shown patterns sublimi- 
nally) performed above chance on trans- 
fer tests in this same paradigm. The debate 
is far from resolved, and there are strong 


ple systems view (e.g., Filoteo, Maddox, & 
Davis, 2001; Maddox, 2002; Nosofsky & Jo- 
hansen, 2000; Palmeri & Flanery, 2002; Re- 
ber, Stark, & Squire, 1998a, 1998b). It is safe 
to predict that this issue will receive continu- 
ing attention. 


INFERENCE LEARNING 


More recently, investigators have begun to 
worry about extending the scope of cate- 
gory learning studies by looking at inference. 
Often, we categorize some entity to help 
us accomplish some function or goal. Ross 
(1997, 1999, 2000) showed that the category 
representations people develop in laboratory 
studies depend on use and that use affects 
later categorization. In other words, models 
of categorization ignore inference and use at 
their peril. Other work suggests that hav- 
ing a cohesive category structure is more 
important for inference learning than it is 
for classification (Yamauchi, Love, & Mark- 
man, 2002; Yamauchi & Markman, 1998, 
2000a, 2000b; for modeling implications see
Love, Markman, & Yamauchi, 2000; Love 
et al., 2004). More generally, this work raises 
the possibility that diagnostic rules based on 
superficial features, which appear so promi- 
nently in pure categorization tasks, may not 
be especially relevant for contexts involv- 
ing multiple functions or more meaning- 
ful stimuli (e.g., Markman & Makin, 1998; 
Wisniewski & Medin, 1994). 


FEATURE LEARNING 


The final topic on our “must mention” list 
for work with artificial categories is feature 
learning. It is a common assumption in both 
models of object recognition and category 
learning that the basic units of analysis or 
features remain unchanged during learning. 
There is increasing evidence and supporting 
computational modeling that indicate this 
assumption is incorrect. Learning may in- 
crease or decrease the distinctiveness of fea- 
tures and may even create new features (see 
Goldstone, 1998, 2003; Goldstone, Lippa, & 
Shiffrin, 2001; Goldstone & Steyvers, 2001;
& Rodet, 1997).

Feature learning has important implica- 
tions for our understanding of the role of 
similarity in categorization. It is intuitively 
compelling to think of similarity as a causal 
factor supporting categorization — things be- 
long to the same category because they are 
similar. However, this may have things back- 
ward. Even standard models of categoriza- 
tion assume learners selectively attend to 
features that are diagnostic, and the work on 
feature learning suggests that learners may 
create new features that help partition ex- 
amples into categories. In that sense, similar- 
ity (in the sense of overlap in features) is the 
by-product, not the cause, of category learn- 
ing. We take up this point again in discussing 
the theory theory of categorization later in 
this review. 


REASONING 


As we noted earlier, one of the central func- 
tions of categorization is to support reason- 
ing. Having categorized some entity as a 
bird, one may predict with reasonable con- 
fidence that it builds a nest, sings, and can 
fly, although none of these inferences is cer- 
tain. In addition, between-category relations 
may guide reasoning. For example, from the 
knowledge that robins have some enzyme in 
their blood, one is likely to be more confi- 
dent that the enzyme is in sparrows than in 
raccoons. The basis for this confidence may 
be that robins are more similar to sparrows 
than to raccoons or that robins and sparrows 
share a lower-rank superordinate category 
than do robins and raccoons (birds versus 
vertebrates). We do not review this literature 
here because Sloman and Lagnado (Chap. 5) 
summarize it nicely. 


SUMMARY 


Bowing to practicalities, we have glossed a 
lot of research and skipped numerous other 
relevant studies. The distinction between ar- 
tificially created and natural categories is 
itself artificial — at least in the sense that 
it has no clear definition or marker. When 
we take up the idea that concepts may be 


some laboratory studies that illustrate this 
fuzzy boundary. For the moment, however, 
we shift attention to the more language-like 
functions of concepts. 


Language Functions 


Most investigators in the concepts and cat- 
egories area continue to assume that, in ad- 
dition to their role in recognition and cat- 
egory learning, concepts also play a role in 
understanding language and in thinking dis- 
cursively about things. In addition to de- 
termining, for example, which perceptual 
patterns signal the appearance of a daisy, 
the DAISY concept also contributes to the 
meaning of sentences such as our earlier 
example, “Fred placed a daisy in a lunch- 
box.” We noted that early psychological re- 
search on concepts ran into problems in 
explaining the meaning of linguistic units 
larger than single words. Most early theories 
posited representations, such as networks, 
exemplars, or prototypes, that did not com- 
bine easily and, thus, complicated the prob- 
lem of sentence meaning. Even if we reject 
the idea that sentence meanings are compo- 
sitional functions of word meaning, we still 
need a theory of sentence meanings, and no 
obvious contenders are in sight. In this sec- 
tion, we return to the role that concepts play 
in language understanding to see whether 
new experiments and theories have clarified 
this relationship. 


CONCEPTS AS POSITIONS IN MEMORY STRUCTURES 


One difficulty with the older semantic mem- 
ory view of word meaning is that memory 
seems to change with experience from one 
person to another, whereas meaning must 
be more or less constant. The sentences 
you have encoded about daisies may differ 
drastically from those we have encoded be- 
cause your conversation, reading habits, and 
other verbal give and take can diverge in 
important ways from ours. If meaning de- 
pends on memory for these sentences, then 
your meaning for “daisy” should likewise 
differ from ours. This raises the question 
of how you could possibly understand the 
tend or how you could meaningfully disagree
with us about some common topic (see
Fodor, 1994). 

It is possible that two people — say, 
Calvin and Martha — might be able to main- 
tain mutual intelligibility as long as their 
conceptual networks are not too different. 
It is partly an empirical question as to 
how much their networks can vary while 
still allowing Calvin’s concepts to map correctly
into Martha’s. To investigate this issue,
Goldstone and Rogosky (2002) carried out 
some simulations that try to recover such a 
mapping. The simulations modeled Calvin's 
conceptual system as the distance between 
each pair of his concepts (e.g., the distance 
between DOG and CAT in Calvin’s system 
might be one unit, whereas the distance be- 
tween DOG and DAISY might be six units). 
Martha’s conceptual system was represented 
in the same way (i.e., by exactly the same 
interconcept distances) except for random 
noise that Goldstone and Rogosky added to 
each distance to simulate the effect of disparate
beliefs. A constraint-satisfaction algorithm
was then applied to Calvin’s and Martha’s
systems in an attempt to recover the original
correspondence between the concepts:
to map Calvin’s DOG to Martha’s DOG,
Calvin’s DAISY to Martha’s DAISY, and so
on. The results of the simulations show that
with 15 concepts in each system (the max- 
imum number considered and the case in 
which the model performed best) and with 
no noise added to Martha’s system, the algo- 
rithm was always able to find the correct cor- 
respondence. When the simulation added to 
each dimension of the interconcept distance 
in Martha a small random increment (drawn 
from a normal distribution with mean 0 and
standard deviation equal to .004 times the
maximum distance), the algorithm recov- 
ered the correspondence about 63% of the 
time. When the standard deviation increased 
to .006 times the maximum distance, the al- 
gorithm succeeded about 15% of the time 
(Goldstone & Rogosky, 2002, Figure 2). 
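
The logic of such a simulation can be sketched as follows. This is not Goldstone and Rogosky's constraint-satisfaction algorithm: as a stand-in, the code searches exhaustively over all relabelings of a small system, which is feasible only for a handful of concepts. The concept locations, the hidden relabeling, and every function name are hypothetical.

```python
import itertools
import math
import random

def distance_matrix(points):
    """All pairwise Euclidean distances between concept locations."""
    return [[math.dist(p, q) for q in points] for p in points]

def relabel(matrix, hidden):
    """Martha's system: Calvin's distances under a hidden renaming of concepts."""
    n = len(matrix)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[hidden[i]][hidden[j]] = matrix[i][j]
    return out

def add_noise(matrix, sd, rng):
    """Perturb each interconcept distance with zero-mean Gaussian noise."""
    n = len(matrix)
    noisy = [row[:] for row in matrix]
    for i in range(n):
        for j in range(i + 1, n):
            noisy[i][j] = noisy[j][i] = matrix[i][j] + rng.gauss(0.0, sd)
    return noisy

def recover_mapping(calvin, martha):
    """Find the relabeling that best preserves Calvin's distances (brute force)."""
    n = len(calvin)
    def mismatch(perm):
        return sum((calvin[i][j] - martha[perm[i]][perm[j]]) ** 2
                   for i in range(n) for j in range(i + 1, n))
    return min(itertools.permutations(range(n)), key=mismatch)

rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(6)]  # hypothetical layout
calvin = distance_matrix(points)
hidden = list(range(6))
rng.shuffle(hidden)

martha_clean = relabel(calvin, hidden)
max_distance = max(max(row) for row in calvin)
martha_noisy = add_noise(martha_clean, 0.004 * max_distance, rng)

mapping_clean = recover_mapping(calvin, martha_clean)
mapping_noisy = recover_mapping(calvin, martha_noisy)
print(tuple(hidden), mapping_clean, mapping_noisy)
```

Without noise, the hidden relabeling is the only mapping with zero mismatch, so the search recovers it. Repeating the noisy case over many random seeds would give a recovery rate, although the figures from such a brute-force search need not match those reported for the original algorithm.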

What should one make of the Goldstone 
and Rogosky results? Correspondences may 
be recovered for small amounts of noise, 


larger amounts of noise. Foes of the meaning- 
as-relative-position theory might claim that 
the poor performance under the .6% noise 
condition proves their contention. Advo- 
cates would point to the successful part of 
the simulations and note that their ability to 
detect correct correspondences usually im- 
proved as the number of points increased (al- 
though there are some nonmonotonicities in 
the simulation results that qualify this find- 
ing). Clearly, this is only the beginning of the 
empirical side of the debate. For example, 
the differences between Martha and Calvin 
are likely to be not only random, but also 
systematic, as in the case in which Martha 
grew up on a farm and Calvin was a city kid. 


CONCEPT COMBINATION 


Let’s look at attempts to tackle head-on the 
problem of how word-level concepts com- 
bine to produce the meanings of larger lin- 
guistic units. There is relatively little re- 
search in this tradition on entire sentences 
(see Conrad & Rips, 1986; Rips, Smith, & 
Shoben, 1978), but there has been a fairly 
steady research stream devoted to noun 
phrases, including adjective-noun (“edible 
flowers”), noun-noun (“food flowers”), and 
noun-relative clause combinations (“flowers 
that are foods”). We'll call the noun or ad- 
jective parts of these phrases components and 
distinguish the main or head noun (“flowers” 
in each of our examples) from the adjective 
or noun modifier (“edible” or “food”). The 
aim of the research in question is to describe 
how people understand these phrases and, in 
particular, how the typicality of an instance 
in these combinations depends on the typ- 
icality of the same instance in the compo- 
nents. How does the typicality of a marigold 
in the category of edible flowers depend on 
the typicality of marigolds in the categories 
of edible things and flowers? As we already 
noticed, this relationship is far from straight- 
forward (parakeets are superbly typical as 
pet birds but less typical pets and even less 
typical birds). 

There is an optimistic way of looking at 
the results of this research program and a 
mostly optimistic, reviews of this work, see
Hampton, 1997; Murphy, 2002; Rips, 1995; 
and Wisniewski, 1997). The optimistic angle 
is that interesting phenomena have turned 
up in investigating the typicality structure of 
combinations. The pessimistic angle, which 
is a direct result of the same phenomena, is 
that little progress has been made in figuring 
out a way to predict the typicality of a com- 
bination from the typicality of its compo- 
nents. This difficulty is instructive — in part 
because all psychological theories of concept 
combination posit complex, structured rep- 
resentations, and they depict concept combi- 
nation either as rearranging (or augmenting) 
the structure of the head noun by means of 
the modifier (Franks, 1995; Murphy, 1988; 
Smith, Osherson, Rips, & Keane, 1988) or as 
fitting both head and modifier into a larger 
relational complex (Gagné & Shoben, 1997). 
Table 3.1 summarizes what is on offer from 
these theories. Earlier models (at the top of 
the table) differ from later ones mainly in 
terms of the complexity of the combination 
process. Smith et al. (1988), for example, 
aimed at explaining simple adjective-noun 
combinations (e.g., “white vegetable”) that, 
roughly speaking, refer to the intersection 
of the sets denoted by modifier and head 
(white vegetables are approximately the in- 
tersection of white things and vegetables). 
In this theory, combination occurs when the 
modifier changes the value of an attribute 
in the head noun (changing the value of the 
color attribute in VEGETABLE to WHITE) 
and boosts the importance of this attribute in 
the overall representation. Later theories at- 
tempted to account for nonintersective com- 
binations (e.g., “criminal lawyers,” who are 
often not both criminals and lawyers). These 
combinations call for more complicated ad- 
justments — for example, determining a rela- 
tion that links the modifier and head (a crim- 
inal lawyer is a lawyer whose clients are in 
for criminal charges) or extracting a value 
from the modifier that can then be assigned 
to the head (e.g., a panther lawyer might be 
one who is especially vicious or tenacious). 
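
The modification process described for Smith et al. (1988) can be sketched with a small attribute-value structure. The schema contents, the numeric weights, the size of the boost, and the simple weighted-match typicality score are all assumptions made for illustration; they are not the published model's parameters or its similarity rule.

```python
from copy import deepcopy

# Hypothetical VEGETABLE schema: each attribute has a weight and a distribution
# over values. All numbers are invented for illustration.
VEGETABLE = {
    "color": {"weight": 1.0, "values": {"green": 0.6, "orange": 0.3, "white": 0.1}},
    "texture": {"weight": 0.5, "values": {"crisp": 0.7, "soft": 0.3}},
}

def modify(head, attribute, value, boost=2.0):
    """Shift the attribute's distribution to the modifier's value and raise its weight."""
    combined = deepcopy(head)
    slot = combined[attribute]
    slot["values"] = {v: (1.0 if v == value else 0.0) for v in slot["values"]}
    slot["values"].setdefault(value, 1.0)  # covers values the head never listed
    slot["weight"] *= boost
    return combined

def typicality(instance, concept):
    """Weighted match between an instance's values and the concept's distributions."""
    return sum(slot["weight"] * slot["values"].get(instance.get(attribute), 0.0)
               for attribute, slot in concept.items())

WHITE_VEGETABLE = modify(VEGETABLE, "color", "white")
cauliflower = {"color": "white", "texture": "crisp"}
print(typicality(cauliflower, VEGETABLE), typicality(cauliflower, WHITE_VEGETABLE))
```

A scheme of this kind handles intersective combinations only. Nothing in it selects a linking relation or extracts a value from the modifier, which is what the nonintersective cases require.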
So why no progress? One reason is that 
many of the combinations that investiga- 


have familiar referents. Some people have 
experience with edible flowers, for example, 
and know that they include nasturtiums, are 
sometimes used in salads, are often brightly 
colored, are peppery tasting, and so on. We 
learn many of these properties by direct 
or indirect observation (by what Hampton, 
1987, called “extensional feedback”), and 
they are sometimes impossible to learn sim- 
ply by knowing the meaning of “edible” and 
“flower.” Because these properties can affect 
the typicality of potential instances, the typi- 
cality of these familiar combinations will not 
be a function of the typicality of their com- 
ponents. This means that if we are going to 
be able to predict typicality in a composi- 
tional way, we will have to factor out the 
contribution of these directly acquired properties.
Rips (1995) referred to this filtering as
the “no peeking principle” — no peeking at 
the referents of the combination. Of course, 
you might be able to predict typicality if 
you already know the relevant real-world 
facts in addition to knowing the meaning 
of the component concepts. The issue about 
understanding phrases, however, is how we 
are able to interpret an unlimited number 
of new ones. For this purpose, people need 
some procedure for computing new mean- 
ings from old ones that is not restricted by 
the limited set of facts they happened to 
have learned (e.g., through idiosyncratic en- 
counters with edible flowers). 

Another reason for lack of progress is 
that some of the combinations used in 
this research may be compounds or lexi- 
calized phrases [e.g., “White House” (ac- 
cent on “White”) = the residence of the 
President] rather than modifier-head con- 
structions [e.g., “white house” (accent on 
“house”) = a house whose color is white]. 
Compounds are often idiomatic; their mean- 
ing is not an obvious function of their 
parts (see Gleitman & Gleitman’s, 1970, 
distinction between phrasal and compound 
constructions; and Partee, 1995). 

Table 3.1

Model: Hampton (1987)
Domain: Noun-Noun and Noun-Relative-Clause NPs (conjunctive NPs, e.g., sports that are also games)
Representation of Head Noun: Schemas (attribute-value lists with attributes varying in importance)
Modification Process: Modifier and head contribute values to combination on the basis of importance and centrality

Model: Smith, Osherson, Rips, & Keane (1988)
Domain: Simple Adjective-Noun NPs (e.g., red apple)
Representation of Head Noun: Schemas (attribute-value lists with distributions of values and weighted attributes)
Modification Process: Adjective shifts value on relevant attribute in head and increases weight on relevant dimension

Model: Murphy (1988)
Domain: Adj-Noun and Noun-Noun NPs (esp. non-predicating NPs, e.g., corporate lawyer)
Representation of Head Noun: Schemas (lists of slots and fillers)
Modification Process: Modifier fills relevant slot; then representation is “cleaned up” on the basis of world knowledge

Model: Franks (1995)
Domain: Adj-Noun and Noun-Noun NPs (esp. privatives, e.g., fake gun)
Representation of Head Noun: Schemas (attribute-value structures with default values for some attributes)
Modification Process: Attribute-values of modifier and head are summed with modifier potentially overriding or negating head values

Model: Gagné & Shoben (1997)
Domain: Noun-Noun NPs
Representation of Head Noun: Lexical representations containing distributions of relations in which nouns figure
Modification Process: Nouns are bound as arguments to relations (e.g., flu virus = virus causing flu)

Model: Wisniewski (1997)
Domain: Noun-Noun NPs
Representation of Head Noun: Schemas (lists of slots and fillers, including roles in relevant events)
Modification Process:
1. Modifier noun is bound to role in head noun (e.g., truck soap = soap for cleaning trucks)
2. Modifier value is reconstructed in head noun (e.g., zebra clam = clam with stripes)
3. Hybridization (e.g., robin canary = cross between robin and canary)

There is a deeper reason, however, for
the difficulty in predicting compound typicality
from component typicality. Even if
we adhere to the no peeking principle and
stick to clear modifier-head constructions,
the typicality of a combination can depend 
on “emergent” properties that are not part 
of the representation of either component 
(Hastie, Schroeder, & Weber, 1990; Kunda, 
Miller, & Claire, 1990; Medin & Shoben, 
1988; Murphy, 1988). For example, you may 
never have encountered, or even thought
about, a smoky apple (so extensional feed-
back does not inform your conception of the 
noun phrase), but nevertheless it is plausi- 
ble to suppose that smoky apples are not 
good tasting. Having a bad taste, however, 
is not a usual property of (and is not likely 
to be stored as part of a concept for) either
apples or smoky things; on the contrary,
smoky things (e.g., meats, cheese, fish) are often quite good tasting.
If you agree with our assessment that
smoky apples are likely to be bad tasting, 
that is probably because you imagine a way 
in which apples could become smoky (being 
caught in a kitchen fire, perhaps) and you in- 
fer that under these circumstances the apple 
would not be good to eat. The upshot is that 
the properties of a combination can depend 
on complex inductive or explanatory infer- 
ences (Johnson & Keil, 2000; Kunda et al., 
1990). If these properties affect the typical- 
ity of an instance with respect to the com- 
bination, then there is little hope of a sim- 
ple model of this phenomenon. No current 
theory comes close to providing an adequate 
and general account of these processes. 
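
The gap between simple compositional models and emergent properties can be made concrete with a small sketch. The code below is a loose, simplified rendering of the kind of adjective-noun modification that Table 3.1 lists for Smith, Osherson, Rips, & Keane (1988): the adjective shifts the value on one attribute of the head and increases that attribute's weight. The attribute names, vote counts, weights, the `boost` parameter, and the `match` function are all assumptions invented for illustration; they are not the published model or its data.

```python
from copy import deepcopy

# Hypothetical prototype for "apple". Each attribute carries a weight and a
# distribution of "votes" over its values. All numbers are invented.
APPLE = {
    "color": {"weight": 1.0, "votes": {"red": 25, "green": 5, "brown": 0}},
    "shape": {"weight": 0.5, "votes": {"round": 15, "square": 0}},
    "taste": {"weight": 0.3, "votes": {"sweet": 20, "bitter": 2}},
}


def modify(head, attribute, value, boost=2.0):
    """Return a modified copy of `head`: every vote on `attribute` moves to
    `value`, and the attribute's weight is multiplied by `boost`."""
    combined = deepcopy(head)
    slot = combined[attribute]
    total = sum(slot["votes"].values())
    slot["votes"] = {v: 0 for v in slot["votes"]}
    slot["votes"][value] = total
    slot["weight"] *= boost
    return combined


def match(instance, concept):
    """Weighted sum of the votes the concept assigns to the instance's values.
    This is a simplified stand-in for a similarity measure."""
    return sum(
        slot["weight"] * slot["votes"].get(instance.get(attr), 0)
        for attr, slot in concept.items()
    )


red_apple = modify(APPLE, "color", "red")
red_instance = {"color": "red", "shape": "round", "taste": "sweet"}
green_instance = {"color": "green", "shape": "round", "taste": "sweet"}

# A red instance matches "red apple" better than "apple"; a green one, worse.
print(match(red_instance, APPLE), match(red_instance, red_apple))
print(match(green_instance, APPLE), match(green_instance, red_apple))
# Only the modified attribute changes, so nothing new about taste can emerge.
print(red_apple["taste"] == APPLE["taste"])
```

In a scheme of this kind, the combination can only rearrange information already stored with the head. An inference such as "smoky apples taste bad" would need knowledge that appears in neither component, which is the difficulty the text describes.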


INFERENTIAL VERSUS ATOMISTIC CONCEPTS 


Research on the typicality structure of noun 
phrases is of interest for what it can tell 
us about people’s inference and problem- 
solving skills. However, because these pro- 
cesses are quite complex — drawing on gen- 
eral knowledge and inductive reasoning to 
produce emergent information — we can not 
predict noun phrase typicality in other than 
a limited range of cases. For much the same 
reason, typicality structure does not appear 
very helpful in understanding how people 
construct the meaning of a noun phrase 
while reading or listening. By themselves, 
emergent properties do not rule out the pos- 
sibility of a model that explains how people 
derive the meaning of a noun phrase from 
the meaning of its components. Composi- 
tionality does not require that all aspects 
of the noun phrase’s meaning are parts of 
the components’ meanings. It is sufficient 
to find some computable function from the 
components to the composite that is simple 
enough to account for people’s understand- 
ing (see Partee, 1995, for a discussion of types 
of composition). The trouble is that if noun 
phrases’ meanings require theory construc- 
tion and problem solving, such a process is 
unlikely to explain the ease and speed with 
which we usually understand them in ongo- 
ing speech. 


role of schemas or prototypes in concept 
combination, but it is worth noting that 
many of the same problems with semantic 
composition affect other contemporary the- 
ories, such as latent semantic analysis (Lan- 
dauer & Dumais, 1997), which take a global 
approach to meaning. Latent semantic 
analysis takes as input a table of the fre- 
quencies with which words appear in spe- 
cific contexts. In one application, for exam- 
ple, the items comprise about 60,000 word 
types taken from 30,000 encyclopedia en- 
tries, and the table indicates the frequency 
with which each word appears in each entry. 
The analysis then applies a technique similar 
to factor analysis to derive an approximately 
300-dimensional space in which each word 
appears as a point and in which words that 
tend to co-occur in context occupy neigh- 
boring regions in the space. Because this 
technique finds a best fit to a large corpus 
of data, it is sensitive to indirect connections 
between words that inform their meaning. 
However, the theory has no clear way to 
derive the meaning of novel sentences. Al- 
though latent semantic analysis could rep- 
resent a sentence as the average position of 
its component words, this would not allow 
it to capture the difference between, say,
“The financier dazzled the movie star” and
“The movie star dazzled the financier,” which
depends on sentence structure. In addition,
the theory uses the distance between two 
words in semantic space to represent the re- 
lation between them, and so the theory has 
trouble with semantic relations that, unlike 
distances, are asymmetric. It is unclear, for 
example, how it could cope with the fact 
that father implies parent but parent does not 
imply father. 
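
Both limitations can be seen in a few lines of code. The sketch below does not run latent semantic analysis itself; it assumes the analysis has already placed each word at a point in a semantic space and uses made-up three-dimensional coordinates (real applications use roughly 300 dimensions). The vectors and the function names `sentence_vector` and `distance` are hypothetical and serve only to illustrate the argument.

```python
import math

# Hypothetical word vectors standing in for points in an LSA-style space.
VECTORS = {
    "the": (0.1, 0.1, 0.1),
    "financier": (0.9, 0.2, 0.1),
    "dazzled": (0.3, 0.8, 0.2),
    "movie": (0.2, 0.3, 0.9),
    "star": (0.1, 0.4, 0.8),
    "father": (0.7, 0.6, 0.1),
    "parent": (0.6, 0.7, 0.2),
}


def sentence_vector(sentence):
    """Represent a sentence as the average position of its words."""
    words = sentence.lower().split()
    dims = len(VECTORS["the"])
    # math.fsum returns a correctly rounded sum, so word order cannot
    # change the result even through floating-point rounding.
    return tuple(
        math.fsum(VECTORS[w][d] for w in words) / len(words)
        for d in range(dims)
    )


def distance(a, b):
    """Euclidean distance between two points in the space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


s1 = sentence_vector("The financier dazzled the movie star")
s2 = sentence_vector("The movie star dazzled the financier")
print(s1 == s2)  # averaging ignores who dazzled whom

d_fp = distance(VECTORS["father"], VECTORS["parent"])
d_pf = distance(VECTORS["parent"], VECTORS["father"])
print(d_fp == d_pf)  # a distance cannot encode a one-way implication
```

Any order-insensitive way of pooling word vectors would behave the same way as the average here, and any symmetric measure would behave the same way as the distance, so the point does not depend on the particular numbers chosen.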

On the one hand, online sentence un- 
derstanding is a rapid, reliable process. On 
the other hand, the meaning of even sim- 
ple adjective-noun phrases seems to re- 
quire heady inductive inferences. Perhaps 
we should distinguish, then, between the 
interpretation of a phrase or sentence and 
its comprehension (Burge, 1999). On this 
view, comprehension gives us a more or less
immediate understanding,
based primarily on the word meaning of the
components and syntactic/semantic struc- 
ture. Interpretation, by contrast, is a po- 
tentially unlimited process relying on the 
result of comprehension plus inference 
and general knowledge. The comprehen- 
sion/interpretation distinction may be more 
of a continuum than a dichotomy, but the 
focus on the interpretation end of the con- 
tinuum means that research on concepts is 
difficult to apply to comprehension. As we 
have just noticed, it is hard, if not impossi- 
ble, to compute the typicality structure of 
composites. So if we want something read- 
ily computable in order to account for com- 
prehension, we have to look to something 
simpler than typicality structures (and the 
networks, prototypes, schemas, or theories 
that underlie them). One possibility (Fodor, 
1994, 1998) is to consider a representation 
in which word meanings are mental units 
not much different from the words them- 
selves, and whose semantic values derive 
from (unrepresented) causal connections to 
their referents. 


GENERIC NOUN PHRASES 


Even if we abandon typicality structures as 
accounts of comprehension, however, it does 
not follow that these structures are use- 
less in explaining all linguistic phenomena. 
More recent research on two fronts seems 
to us to hold promise for interactions be- 
tween psychological and linguistic theories. 
First, there are special constructions in En- 
glish that, roughly speaking, describe default 
characteristics of members of a category. For 
example, “Lions have manes” means (approximately)
that having a mane is a characteristic
property of lions. Bare plural noun
phrases (i.e., plurals with no preceding determiners)
are one way to convey such a
meaning, as we have just noticed, but indefinite
singular sentences (“A lion has a mane”)
and definite singular sentences (“The lion — 
Panthera leo — has a mane”) can also convey 
the same idea in some of their senses. These 
generic sentences seem to have normative 
content. Unlike “Most lions have manes,”
generic sentences tolerate the
existence of numerous exceptions; “Lions
have manes” seems to be true even though 
most lions (e.g., female and immature lions) 
do not have manes (see Krifka et al., 1995, 
for an introduction to generic sentences). 
There is an obvious relation between the 
truth or acceptability of generic sentences 
and the typicality structure of categories be- 
cause the typical properties of a category 
are those that appear in true generic sen- 
tences. Of course, as Krifka et al. noted, this 
may simply be substituting one puzzle (the 
truth conditions of generic sentences) for an- 
other (the nature of typical properties), but 
this may be one place where linguistic and 
cognitive theories might provide mutual insight.
Research by Susan Gelman and her
colleagues (see Gelman, 2003, for a thorough
review) suggests that generic sentences
are a frequent way for caregivers to convey 
category information to children. Four-year- 
olds differentiate sentences with bare plurals 
(“Lions have manes”) from those explicitly 
quantified by “all” or “some” in comprehen- 
sion, production, and inference tasks (Gel- 
man, Star, & Flukes, 2002; Hollander, Gel- 
man, & Star, 2002). It would be of interest 
to know, however, at what age, and in what 
way, children discriminate generics from ac- 
cidental generalizations — for example, when 
they first notice the difference between “Li- 
ons have manes” and “Lions frequently have 
manes” or “Most lions have manes.” 


POLYSEMY 


A second place to look for linguistic-cognitive
synergy is in an account of the meanings
of polysemous words. Linguists (e.g.,
Lyons, 1977, Chap. 13) traditionally distin- 
guish homonyms such as “mold,” which have 
multiple unrelated meanings (e.g., a form 
into which liquids are poured vs. a fungus), 
from polysemous terms such as “line,” which 
have multiple related meanings (e.g., a geo- 
metric line vs. a fishing line vs. a line of peo- 
ple, etc.). What makes polysemous terms in- 
teresting to psychologists in this area is that 
the relations among their meanings often 
possess a kind of typicality structure of their
own. This is the typicality of the senses of an
expression rather than the typicality of the
referents of the expression and is thus a type 
of higher-level typicality phenomenon. Fig- 
ure 3.2 illustrates such a structure for the 
polysemous verb “crawl,” as analyzed by 
Fillmore and Atkins (2000). A rectangle 
in the figure represents each sense or use 
and includes both a brief label indicat- 
ing its distinctive property and an exam- 
ple from British corpuses. According to 
Fillmore and Atkins, the central meanings 
for crawl have to do with people or crea- 
tures moving close to the ground (these 
uses appear in rectangles with darker out- 
lines in the figure). But there are many 
peripheral uses — for example, time mov- 
ing slowly (“The hours seemed to crawl 
by”) and creatures teeming about (“The pic- 
nic supplies crawled with ants”). The cen- 
tral meanings are presumably the original 
ones with the peripheral meanings derived 
from these by a chaining process. Malt, 
Sloman, Gennari, Shi, and Wang (1999) 
observed similar instances of chaining 
in people’s naming of artifacts, such as bot- 
tles and bowls, and it is possible that the 
gerrymandered naming patterns reflect the 
polysemy of the terms (e.g., “bottle”) rather 
than different uses of the same meaning. As 
Figure 3.2 shows, it is not easy to distinguish 
different related meanings (polysemy) from 
different uses of the same meaning (contex- 
tual variation) and from different unrelated 
meanings (homonymy). 

Some research has attacked the issue of 
whether people store each of the separate 
senses of a polysemous term (Klein & Murphy,
2002) or store only the core meaning,
deriving the remaining senses as needed
for comprehension (Caramazza & Grober, 
1976; Franks, 1995). Conflicting evidence 
in this respect may be due to the fact that 
some relations between senses seem rela- 
tively productive and derivable (regular pol- 
ysemy, such as the relationship between 
terms for animals and their food products, 
e.g., the animal meaning of “lamb” and its 
menu meaning), whereas other senses seem 
ad hoc (e.g., the relation between “crawl” = 
moving close to the ground and “crawl” = 


mechanisms are likely to be at work here. 


SUMMARY 


We do not mean to suggest that the only lin- 
guistic applications of psychologists’ “con- 
cepts” are in dealing with interpretation, 
generic phrases, and polysemy — far from 
it. There are many areas, especially in de- 
velopmental psycholinguistics, that hold the 
promise of fruitful interactions but that we 
cannot review here. Nor are we suggesting 
that investigators in this area give up the at- 
tempt to study the use of concepts in im- 
mediate comprehension. However, concepts 
for comprehension seem to have different 
properties from the concepts that figure in 
the other functions we have discussed, and 
researchers need to direct more attention to 
the interface between them. 


Theories, Modules, and 
Psychometaphysics 


We have seen, so far, some downward pres- 
sure on cognitive theories to portray human 
concepts as mental entities that are as simple 
and streamlined as possible. This pressure 
comes not only from the usual goal of par- 
simony but also from the role that concepts 
play in immediate language comprehension. 
However, there is also a great deal of upward 
pressure — pressure to include general knowl- 
edge about a category as part of its represen- 
tation. For example, the presence of emer- 
gent properties in concept combinations 
suggests that people use background knowl- 
edge in interpreting these phrases. Similarly, 
people may bring background knowledge 
and theories to bear in classifying things even 
when they know a decision rule for the category.
Consider psychodiagnostic classification.
Although DSM-IV (the official diagnostic
manual of the American Psychiatric
Association) is atheoretical and organized
in terms of rules, there is clear evidence
that clinicians develop theories of disorders 
and, contra DSM-IV, weight causally cen- 
tral symptoms more than causally periph- 
eral symptoms (e.g., Kim & Ahn, 2002a). 
The same holds for laypersons (e.g., Furnham,
1995; Kim & Ahn, 2002b).

Figure 3.2. The meanings of crawl: Why it is difficult to distinguish different related meanings
(polysemy) from different uses of the same meaning (contextual variation) and from different
unrelated meanings (homonymy). Adapted from Fillmore & Atkins (2000) by permission of Oxford
University Press.

In this section, we examine the consequences
of expanding the notion of a
concept to include theoretical information 
about a category. In the case of the natu- 
ral categories, this information is likely to be 
causal because people probably view physi- 
cal causes as shaping and maintaining these 
categories. For artifacts, the relevant infor- 
mation may be the intentions of the person 
creating the object (e.g., Bloom, 1996). The 
issues we raise here concern the content and 
packaging of these causal beliefs. 

The first of these issues focuses on 
people’s beliefs about the locus of these 
causal forces — what we called “psychometa- 
physics.” At one extreme, people may be- 
lieve that each natural category is associated 
with a single source, concentrated within a
category instance, that controls the nature
of that instance. The source could deter- 
mine, among other things, the instance’s typ- 
ical properties, its category membership, and 
perhaps even the conditions under which 
it comes into and goes out of existence. 
Alternatively, people may believe that the 
relevant causal forces are more like a swarm — 
not necessarily internal to an instance, nor 
necessarily emanating from a unitary spot — 
but shaping the category in aggregate 
fashion. 

The second issue has to do with the cogni- 
tive divisions that separate beliefs about dif- 
ferent sorts of categories. People surely be- 
lieve that the causes that help shape daisies 
differ in type from those that shape teapots. 
Lay theories about flowers and other living
things include at least crude information
about specifically biological properties,
whereas lay theories about teapots and other
artifacts touch instead on intended and actual
functions. However, how deep do these di- 
visions go? On the one hand, beliefs about 
these domains could be modular (relatively 
clustered, relatively isolated), innate, uni- 
versal, and local to specific brain regions. 
On the other hand, they may be free float- 
ing, learned, culturally specific, and dis- 
tributed across cortical space. This issue is 
important to us because it ultimately af- 
fects whether we can patch up the “semantic 
memory” marriage. 


Essentialism and Sortalism 


PSYCHOLOGICAL ESSENTIALISM 


What’s the nature of people’s beliefs about 
the causes of natural kinds? One hypothesis 
is that people think there is something inter- 
nal to each member of the kind — an essence — 
that is responsible for its existence, cate- 
gory membership, typical properties, and 
other important characteristics (e.g., Atran, 
1998; Gelman & Hirschfeld, 1999; Medin & 
Ortony, 1989). Of course, it is unlikely that 
people think that all categories of natural 
objects have a corresponding essence. There 
is probably no essence of pets, for example, 
that determines an animal’s pet status. How- 
ever, for basic-level categories, such as dogs 
or gold or daisies, it is tempting to think that 
something in the instance determines cru- 
cial aspects of its identity. Investigators who 
have accepted this hypothesis are quick to 
point out that the theory applies to people’s 
beliefs and not to the natural kinds them- 
selves. Biologists and philosophers of science 
agree that essentialism will not account for 
the properties and variations that real species 
display, in part because the very notion of 
species is not coherent (e.g., Ghiselin, 1981; 
Hull, 1999). Chemical kinds, for example, 
gold, may conform much more closely to 
essentialist doctrine (see Sober, 1980). Nev- 
ertheless, expert opinion is no bar to layper- 
sons’ essentialist views on this topic. In addi- 
tion, psychological essentialists have argued 
that people probably do not have a fully 
fleshed out explanation of what the essence 
is. What they have, on this hypothesis, is an
assumption that there is
something that plays the role of essence even
though they cannot supply a description of
it (Medin & Ortony, 1989). 

Belief in a hypothetical, minimally de- 
scribed essence may not seem like the sort 
of thing that could do important cognitive 
work, but psychological essentialists have 
pointed out a number of advantages that 
essences might afford, especially to chil- 
dren. The principal advantage may be in- 
duction potential. Medin (1989) suggested 
that essentialism is poor metaphysics but 
good epistemology in that it may lead peo- 
ple to expect that members of a kind will 
share numerous, unknown properties — an 
assumption that is sometimes correct. In 
short, essences have a motivational role to 
play in getting people to investigate kinds’ 
deeper characteristics. Essences also explain 
why category instances seem to run true to 
type — for example, why the offspring of pigs 
grow up to be pigs rather than cows. They 
also explain the normative character of kinds 
(e.g., their ability to support inductive ar- 
guments and their ability to withstand ex- 
ceptions and superficial changes) as well as 
people’s tendency to view terms for kinds as 
well defined. 

Evidence for essentialism tends to be in- 
direct. There are results that show that chil- 
dren and adults do in fact hold the sorts 
of beliefs that essences can explain. By the 
time they reach first or second grade, chil- 
dren know that animals whose insides have 
been removed are no longer animals, that 
baby pigs raised by cows grow up to be 
pigs rather than cows (Gelman & Well- 
man, 1991), and that cosmetic surgery does 
not alter basic-level category membership 
(Keil, 1989). Research on adults also shows 
that “deeper” causes — those that themselves 
have few causes but many effects — tend to 
be more important in classifying than shal- 
lower causes (Ahn, 1998; Sloman, Love, & 
Ahn, 1998). 

However, results like these are evidence 
for essence only if there are no better ex- 
planations for the same results, and it seems 
at least conceivable that children and adults 
make room for multiple types and sources
of cause.
According to Strevens (2000), for example,
although people’s reasoning and classifying 
suggest that causal laws govern natural kinds, 
it may be these laws alone, rather than a uni- 
fying essence, that are responsible for the 
findings. According to essentialists, people 
think there is something (an essence) that is 
directly or indirectly responsible for the typ- 
ical properties of a natural kind. According 
to Strevens’ minimalist alternative, people 
think that for each typical property there is 
something that causes it and that something 
may vary for different properties. It is im- 
portant to settle this difference — the pres- 
ence or absence of a unique central cause — 
if only because the essentialist claim is the 
stronger one. 
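
The logical relation between the two positions can be sketched with toy causal graphs. In the code below, a graph maps each cause to the effects it produces. The minimalist condition asks only that every typical property have some cause; the essentialist condition asks that a single cause lie, directly or indirectly, behind all of them. The node names (`essence`, `pigment`, `larynx`, `stripes`, `roar`) and the function names are hypothetical and chosen purely for illustration.

```python
def causes_of(prop, graph):
    """All direct and indirect causes of `prop` in a cause -> effects graph."""
    found = set()
    frontier = [prop]
    while frontier:
        node = frontier.pop()
        for cause, effects in graph.items():
            if node in effects and cause not in found:
                found.add(cause)
                frontier.append(cause)
    return found


def satisfies_minimalism(properties, graph):
    """Every typical property has at least one cause."""
    return all(causes_of(p, graph) for p in properties)


def satisfies_essentialism(properties, graph):
    """Some single cause lies behind every typical property."""
    shared = set.intersection(*(causes_of(p, graph) for p in properties))
    return bool(shared)


TYPICAL = ["stripes", "roar"]

# One hidden source drives both properties through intermediate causes.
ESSENTIALIST = {
    "essence": ["pigment", "larynx"],
    "pigment": ["stripes"],
    "larynx": ["roar"],
}

# Each property has its own cause, with nothing unifying them.
MINIMALIST = {
    "pigment": ["stripes"],
    "larynx": ["roar"],
}

for name, graph in [("essentialist", ESSENTIALIST), ("minimalist", MINIMALIST)]:
    print(name,
          satisfies_minimalism(TYPICAL, graph),
          satisfies_essentialism(TYPICAL, graph))
```

Any structure that meets the essentialist condition also meets the minimalist one, but not the reverse, which is the sense in which the essentialist claim is the stronger of the two.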

Essentialists counter that both chil- 
dren and adults assume a causal struc- 
ture consistent with essence (see Braisby, 
Franks, & Hampton, 1996; Diesendruck & 
Gelman, 1999; and Kalish, 1995, 2002, for 
debate on this issue). One strong piece of 
evidence for essentialism is that participants 
who have successfully learned artificial, fam- 
ily resemblance categories (i.e., those in 
which category members have no single feature
in common) nevertheless believe that
each category contains a common, defining
property (Brooks & Wood, as cited by Ahn 
et al., 2001). Other studies with artificial 
“natural” kinds have directly compared es- 
sentialist and nonessentialist structures but 
have turned in mixed results (e.g., Rehder 
& Hastie, 2001). It is possible that explicit 
training overrides people’s natural tendency 
to think in terms of a common cause. 

In the absence of more direct evidence 
for essence, the essentialist-minimalist de- 
bate is likely to continue (see Ahn et al., 
2001; Sloman & Malt, 2003; and Strevens, 
2001, for the latest salvos in this dispute). 
Indeed, the authors of this chapter are not 
in full agreement. Medin finds minimalism 
too unconstrained, whereas Rips opines that 
essentialism suffers from the opposite prob- 
lem. Adding a predisposition toward parsi- 
mony to the minimalist view seems like a 
constructive move, but such a move would 
shift minimalism considerably closer to essentialism.
Ultimately, the issue boils down
to determining to what extent causal understandings
are biased toward the assumption
of a unique, central cause for a category’s 
usual properties. 


SORTALISM 


According to some versions of essential- 
ism, an object’s essence determines not only 
which category it belongs to but also the ob- 
ject’s very identity. According to this view, 
it is by virtue of knowing that Fido is a dog 
that you know (in principle) how to identify 
Fido over time, how to distinguish Fido from 
other surrounding objects, and how to de- 
termine when Fido came into existence and 
when he will go out of it. In particular, if Fido 
happens to lose his dog essence, then Fido 
not only ceases to be a dog, but he also ceases 
to exist entirely. As we noted in discussing 
essentialism, not all categories provide these 
identity conditions. Being a pet, for example, 
doesn’t lend identity to Fido because he may 
continue to survive in the wild as a nonpet. 
According to one influential view (Wiggins, 
1980), the critical identity-lending category 
is the one that answers the question What 
is it? for an object, and because basic-level 
categories are sometimes defined in just this 
way, basic-level categories are the presumed 
source of the principles of identity. (Theo- 
ries of this type usually assume that identity 
conditions are associated with just one cate- 
gory for each object because multiple iden- 
tity conditions lead to contradictions; see 
Wiggins, 1980). Contemporary British phi- 
losophy tends to refer to such categories as 
sortals, however, and we adopt this termi- 
nology here. 

Sortalism plays an important role in cur- 
rent developmental psychology because de- 
velopmentalists have used children’s mas- 
tery of principles of identity to decide 
whether these children possess the associ- 
ated concept. In some well-known studies, 
Xu and Carey (1996) staged for infants a 
scene in which a toy duck appears from one 
side of an opaque screen and then returns be- 
hind it. A toy truck next emerges from the 
other side of the screen and then returns to 
its hidden position. Infants habituate after
a number of repetitions of this sequence,
at which time the screen is removed to reveal
both the duck and truck (the scene
that adults expect) or just one of the ob- 
jects (duck or truck). Xu and Carey reported 
that younger infants (e.g., 10-month-olds) 
exhibit no more surprise at seeing one ob- 
ject than at seeing two, whereas older infants 
(and adults) show more surprise at the one- 
object tableau. Xu and Carey also showed in 
control experiments that younger and older 
infants perform identically if they see a pre- 
view of the two starring objects together be- 
fore the start of the performance. The inves- 
tigators infer that the younger infants lack 
the concepts DUCK and TRUCK because 
they are unable to use a principle of identity 
for these concepts to discern that a duck 
cannot turn into a truck while behind the 
screen. Xu and Carey’s experiments have 
sparked a controversy about whether the 
experimental conditions are simple enough 
to allow babies to demonstrate their grip 
on object identity (see Wilcox & Baillargeon,
1998; Xu, 2003), but for present
purposes what is important is the assump- 
tion that infants’ inability to reidentify ob- 
jects over temporal gaps implies lack of the 
relevant concepts. 

Sortal theories impose strong constraints 
on some versions of essentialism. We noted 
that one of essentialism’s strong points is 
its ability to explain some of the norma- 
tive properties of concepts — for example, 
the role concepts play in inductive infer- 
ences. However, sortalism places some re- 
strictions on this ability. Members of sortal 
categories can not lose their essence without 
losing their existence, even in counterfac- 
tual circumstances. This means that if we are 
faced with a premise such as Suppose dogs can 
bite through wire... , we cannot reason about 
this supposition by assuming the essence of 
dogs has changed in such a way as to make 
dogs stronger. A dog with changed essence 
is not a superdog, according to sortalism, 
but rather has ceased to exist (see Rips, 
2001). For the same reason, it is impossible 
to believe without contradiction both that 
basic-level categories are sortals and that ob- 
jects can shift from one basic-level category 
to another.

These consequences of sortalism may be
reasonable ones, but it is worth considering
the possibility that sortalism — however well 
it fares as a metaphysical outlook — incor- 
rectly describes people’s views about object 
identity. Although objects typically do not 
survive a leap from one basic-level category 
to another, it may not be impossible for them 
to do so. Blok, Newman, and Rips (in press) 
and Liittschwager (1995) gave participants 
scenarios that described novel transforma- 
tions that sometimes altered the basic-level 
category. In both studies, participants were 
more likely to agree that the transformed 
object was identical to the original if the 
transformational distance was small. How- 
ever, these judgments could not always be 
predicted by basic-level membership. 
Results from these sci-fi scenarios should 
be treated cautiously, but they suggest that 
people think individual objects have an in- 
tegrity that does not necessarily line up with 
their basic-level category. Although this idea 
may be flawed metaphysics, it is not unrea- 
sonable as psychometaphysics. People may 
think that individuals exist as the result of lo- 
cal causal forces — forces that are only loosely 
tethered to basic-level kinds. As long as these 
forces continue to support the individual’s
coherence, it can exist even if it finds itself in
a new basic-level category. Of course, not all
essentialists buy into this link between sor- 
talism and essentialism. For example, people 
might believe that an individual has both 
a category essence and a history and other 
characteristics that make it unique. Gutheil 
and Rosengren (1996) hypothesized that objects
have two different essences, one for
membership and another for identity. Just 
how individual identity and kind identity 
play out under these scenarios could then
be highly variable.


Domain Specificity 


The notion of domain specificity has served 
to organize a great deal of research on con- 
ceptual development. For example, much of 
the work on essentialism has been conducted 
in the context of exploring children’s naive 
biology (see also Au, 1994; Carey, 1995; 
Gopnik & Wellman, 1994; Spelke, Phillips,
& Woodward, 1995). Learning in a given
domain may be guided by certain skeletal
principles, constraints, and (possibly innate) 
assumptions about the world (see Gelman, 
2003; Gelman & Coley, 1990; Keil, 1981; 
Kellman & Spelke, 1983; Markman, 1990; 
Spelke, 1990). Carey’s (1985) influential 
book presented a view of knowledge acquisi- 
tion as built on framework theories that en- 
tail ontological commitments in the service 
of a causal understanding of real-world phe- 
nomena. Two domains can be distinguished 
from one another if they represent ontologi- 
cally distinct entities and sets of phenomena 
and are embedded within different causal 
explanatory frameworks. These ontological 
commitments serve to organize knowledge 
into domains such as naive physics (or me- 
chanics), naive psychology, or naive biology 
(e.g., see Au, 1994; Carey, 1995; Gelman 
& Koenig, 2001; Gopnik & Wellman, 1994; 
Hatano & Inagaki, 1994; Keil, 1994; Spelke 
et al., 1995; Wellman & Gelman, 1992). In 
the following, we focus on one candidate do- 
main, naive biology. 


FOLK BIOLOGY AND UNIVERSALS 


There is fairly strong evidence that all cul- 
tures partition local biodiversity into tax- 
onomies whose basic level is that of the 
“generic species” (Atran, 1990; Berlin et al., 
1973). Generic species often correspond to 
scientific species (e.g., elm, wolf, robin);
however, for the large majority of percep- 
tually salient organisms (see Hunn, 1999), 
such as vertebrates and flowering plants, a 
scientific genus frequently has only one lo- 
cally occurring species (e.g., bear, oak). In 
addition to the spontaneous division of local 
flora and fauna into generic species, cultures 
seem to structure biological kinds into hi- 
erarchically organized groups, such as white 
oak/oak/tree. Folk biological ranks vary lit- 
tle across cultures as a function of theo- 
ries or belief systems (see Malt, 1994, for a 
review). For example, in studies with Na- 
tive American and various U.S. and Low- 
land Maya groups, correlations between folk 
taxonomies and classical evolutionary tax- 
onomies of the local fauna and flora average
r = .75 at the generic species level
(Atran, 1999; Bailenson et al., 2002; Medin
et al., 2002). Much of the remaining vari- 
ance owes to obvious perceptual biases 
(Itza’ Maya group bats with birds in the 
same life form) and local ecological con- 
cerns. Contrary to received notions about 
the history and cross-cultural basis for folk 
biological classification, utility does not appear
to drive folk taxonomies (cf. Berlin
et al., 1973). 

These folk taxonomies also appear to 
guide and constrain reasoning. For exam- 
ple, Coley, Medin, and Atran (1997) found 
that both Itza’ Maya and U.S. undergradu- 
ates privilege the generic species level in in- 
ductive reasoning. That is, an inference from 
swamp white oak to all white oaks is little if 
any stronger than an inference from swamp 
white oak to all oaks. Above the level of 
oak, however, inductive confidence takes a 
sharp drop. In other words, people in both 
cultures treat the generic level (e.g., oak) as 
maximizing induction potential. The results 
for undergraduates are surprising because 
the original Rosch et al. (1976) basic-level 
studies had suggested that a more abstract 
level (e.g., TREE) acted as basic for under- 
graduates and should have been privileged 
in induction. That is, there is a discrep- 
ancy between results with undergraduates 
on basicness in naming, perceptual classifi- 
cation, and feature listing, on the one hand, 
and inductive inference, on the other hand. 
Coley et al. (1997) suggested that the rea- 
soning task relies on expectations associated 
with labeling rather than knowledge and that 
undergraduates may know very little about 
biological kinds (see also Wolff, Medin, & 
Pankratz, 1999). Medin and Atran (in press) 
cautioned against generalizing results on bi- 
ological thought from undergraduates be- 
cause most have relatively little first-hand 
experience with nature. 


INTERDOMAIN DIFFERENCES 


One of the most contested domain distinc- 
tions, and one that has generated much 
research, is that between psychology and bi- 
ology (e.g., Au & Romo, 1996, 1999; Carey, 
1991; Coley, 1995; Gelman, 2003; Hatano 
Inagaki & Hatano, 1993, 1996; Johnson &
Carey, 1998; Keil, 1995; Keil, Levin, Richman,
& Gutheil, 1999; Rosengren et al.,
1991; Springer & Keil, 1989, 1991). Carey 
(1985) argued that children initially under- 
stand biological concepts such as ANIMAL 
in terms of folk psychology, treating ani- 
mals as similar to people in having beliefs 
and desires. Others (e.g., Keil, 1989) argued 
that young children do have biologically spe- 
cific theories, albeit more impoverished than 
those of adults. For example, Springer and 
Keil (1989) showed that preschoolers think 
biological properties are more likely to be 
passed from parent to child than are so- 
cial or psychological properties. They ar- 
gued that this implies that the children have 
a biology-like inheritance theory. The evi- 
dence concerning this issue is complex. On 
the one hand, Solomon, Johnson, Zaitchik, 
and Carey (1996) claimed that preschoolers 
do not have a biological concept of inheri- 
tance because they do not have an adult’s 
understanding of the biological causal mech- 
anism involved. On the other hand, there 
is growing cross-cultural evidence that 4- 
to 5-year-old children believe (like adults) 
that the category membership of animals 
and plants follows that of their progeni- 
tors regardless of the environment in which 
the progeny matures (e.g., progeny of cows 
raised with pigs, acorns planted with apple 
seeds) (Atran et al., 2001; Gelman & Well- 
man, 1991; Sousa et al., 2002). Furthermore, 
it appears that Carey’s (1985) results on psy- 
chology versus biology may only hold for ur- 
ban children who have little intimate contact
with nature (Atran et al., 2001; Ross
et al., 2003). Altogether, the evidence sug- 
gests that 4- to 5-year-old children do have a 
distinct biology, although perhaps one with- 
out a detailed model of causal mechanisms 
(see Rozenblit & Keil, 2002, for evidence
that adults also only have a superficial un- 
derstanding of mechanisms). 


DOMAINS AND BRAIN REGIONS 


Are these hypothesized domains associ- 
ated with dedicated brain structure? There 
is intriguing evidence concerning category- 


their ability to recognize and name category 
members in a particular domain of concepts. 
For example, Nelson (1946) reported a pa- 
tient who was unable to recognize a tele- 
phone, a hat, or a car but could identify 
people and other living things (the opposite 
pattern is also observed and is more com- 
mon). These deficits are consistent with the 
idea that anatomically and functionally dis- 
tinct systems represent living versus non- 
living things (Sartori & Job, 1988). An al- 
ternative claim (e.g., Warrington & Shallice, 
1984) is that these patterns of deficits are due 
to the fact that different kinds of informa- 
tion aid in categorizing different kinds of ob- 
jects. For example, perceptual information 
may be relatively more important for recog- 
nizing living kinds and functional informa- 
tion more important for recognizing artifacts 
(see Devlin et al., 1998; Farah & McClelland, 
1991, for computational implementations of 
these ideas). Although the weight of evi- 
dence appears to favor the kinds of informa- 
tion view (see Damasio et al., 1996; Forde 
& Humphreys, in press; Simmons & Barsa- 
lou, 2003), the issue continues to be debated 
(see Caramazza & Shelton, 1998, for a strong 
defense of the domain specificity view). 


DOMAINS AND MEMORY 


The issue of domain specificity returns us 
to one of our earlier themes: Does memory
organization depend on meaning? We
have seen that early research on semantic 
memory was problematic in this respect be- 
cause many of the findings that investigators 
used to support meaning-based organiza- 
tion had alternative explanations. General- 
purpose decision processes could produce 
the same pattern of results even if the in- 
formation they operated on was haphaz- 
ardly organized. Of course, in those olden 
days, semantic memory was supposed to 
be a hierarchically organized network like 
that in Figure 3.1; the network clustered 
concepts through shared superordinates and 
properties but was otherwise undifferenti- 
ated. Modularity and domain specificity of- 
fer a new take on semantic-based memory 
structure — a partition of memory space into 
theories like these support memory organization
in a more adequate fashion than homogeneous
networks?

One difficulty in merging domain speci- 
ficity with memory structure is that domain 
theories do not taxonomize categories — they 
taxonomize assumptions. What differenti- 
ates domains is the set of assumptions or 
warrants they make available for thinking 
and reasoning (see Toulmin, 1958, for one 
such theory), and this means that a particular
category of objects usually falls in
more than one domain. To put it another 
way, domain-specific theories are “stances” 
(Dennett, 1971) or “construals” (Keil, 1995) 
that overlap in their instances. Take the 
case of people. The naive psychology do- 
main treats people as having beliefs and goals 
that lend themselves to predictions about 
actions (e.g., Leslie, 1987; Wellman, 1990). 
The naive physics domain treats people as 
having properties such as mass and velocity 
that warrant predictions about support and 
motion (e.g., Clement, 1983; McCloskey, 
1983). The naive law school domain treats 
people as having properties, such as social 
rights and responsibilities, that lead to predictions
about obedience or deviance (e.g.,
Fiddick, Cosmides, & Tooby, 2000). The 
naive biology domain (at least in the West- 
ern adult version) treats people as having 
properties such as growth and self-animation 
that lead to expectations about behavior and 
development. In short, each ordinary cate- 
gory may belong to many domains. 

If domains organize memory, then long- 
term memory will have to store a concept 
in each of the domains to which it is re- 
lated. Such an approach makes some of 
the difficulties of the old semantic memory 
more perplexing. Recall the issue of identify- 
ing the same concept across individuals (see 
“Concepts as Positions in Memory Struc- 
tures”). Memory modules have the same 
problem, but they add to it the dilemma of 
identifying concepts within individuals. How 
do you know that PEOPLE in your psychol- 
ogy module is the same concept as PEOPLE 
in your physics module and PEOPLE in your 
law school module? Similarity is out (be- 


the same way), spelling is out (both concepts 
might be tied to the word “people” in an in- 
ternal dictionary, but then fungi and metal 
forms are both tied to the word “mold”), 
and interconnections are out (because they 
would defeat the idea that memory is organized
by domain). We cannot treat the
multiple PEOPLE concepts as independent 
either because it is important to get back 
and forth between them. For example, the 
rights and responsibilities information about 
people in your law school module has to 
get together with the goals and desires in- 
formation about people in your psychology 
module in case you have to decide, together 
with your fellow jury members, whether the 
killing was a hate crime or was committed 
with malice aforethought. 

It is reasonable to think that background 
theories provide premises or grounds for in- 
ferences about different topics, and it is also 
reasonable to think that these theories have 
their “proprietary concepts.” However, if we 
take domain-specific modules as the basis 
for memory structure — as a new semantic 
memory — we also have to worry about non- 
proprietary concepts. We have argued that 
there must be such concepts because we 
can reason about the same thing with dif- 
ferent theories. Multiple storage is a possi- 
bility if you are willing to forego memory 
economy and parsimony and if you can solve 
the identifiability problem that we discussed 
in the previous paragraph. Otherwise, these 
domain-independent concepts have to in- 
habit a memory space of their own, and 
modules cannot be the whole story.


SUMMARY 


We seem to be arriving at a skeptical posi- 
tion with respect to the question of whether 
memory is semantically organized, but we 
need to be clear about what is and what is 
not in doubt. What we doubt is that there 
is compelling evidence that long-term mem- 
ory is structured in a way that mirrors lexical 
structure as in the original semantic mem- 
ory models. We do not doubt that mem- 
ory reflects meaningful relations among con- 
cepts, and it is extremely plausible that these 
meanings. For example, there may well be
a relation in memory that links the concept 
TRUCKER with the concept BEER, and the 
existence of this link is probably due in part 
to the meaning of “trucker” and “beer.” What 
is not so clear is whether memory structure 
directly reflects the sort of relations that, in 
linguistic theory, organize the meaning of
words (where, e.g., “trucker” and “beer” are 
probably not closely connected). We note, 
too, that we have not touched (and we do 
not take sides on) two related issues, which 
are themselves subjects of controversy. 

One of these residual issues is whether 
there is a split in memory between (1) gen- 
eral knowledge and (2) personally experi- 
enced information that is local to time and 
place. Semantic memory (Tulving, 1972) or 
generic memory (Hintzman, 1978) is some- 
times used as a synonym for general knowl- 
edge in this sense, and it is possible that 
memory is partitioned along the lines of this 
semantic/episodic difference, even though 
the semantic side is not organized by lexical 
content. The controversy in this case is how 
such a dual organization can handle learning 
of “semantic” information from “episodic” 
encounters (see Tulving, 1984, and his crit- 
ics in the same issue of Behavioral and Brain 
Sciences, for the ins and outs of this debate). 

The second issue that we are shirking is 
whether distributed brands of connection- 
ist models can provide a basis for meaning- 
based memory. One reason for shirking is 
that distributed organization means that 
concepts such as DAISY and CUP are not 
stored according to their lexical content. 
Instead, parts of the content of each con- 
cept are smeared across memory in over- 
lapping fashion. It is possible, however, that 
at a subconcept level (at the level of features
or hidden units) memory has a semantic
dimension, and we must leave this
question open. 


Conclusions and Future Directions 


Part of our charge was to make some pro- 
jections about the future of research on 


attitude toward our predictions. However, 
there are several trends that we have identi- 
fied and, barring unforeseen circumstances 
(never a safe assumption), these trends 
should continue. One property our nomina- 
tions share is that they uniformly broaden 
the scope of research on concepts. Here’s our 
shortlist. 


Sensitivity to Multiple Functions 


The prototypical categorization experiment 
involves training undergraduates for about 
an hour and then giving transfer tests to as- 
sess what they have learned. This practice is 
becoming increasingly atypical, even among 
researchers studying artificially constructed 
categories in the lab. More recently, re- 
searchers have studied functions other than 
categorization, as well as interactions across 
functions. (See also Solomon et al., 1999.) 


Broader Applications of Empirical 
Generalizations and Computational 
Models 


As a wider range of conceptual functions 
comes under scrutiny, new generalizations 
emerge and computational models face new 
challenges (e.g., Yamauchi et al., 2002). Both 
developments set the stage for better bridg- 
ing to other contexts and applications. This is 
perhaps most evident in the area of cognitive 
neuroscience, where computational models 
have enriched studies of multiple categoriza- 
tion and memory systems (and vice versa). 
Norman, Brooks, Coblenz, and Babcock 
(1992) provided a nice example of exten- 
sions from laboratory studies to medical di- 
agnosis in the domain of dermatology. 


Greater Interactions between Work on 
Concepts and Psycholinguistic Research 


We have pressed the point that research on 
concepts has diverged from psycholinguis- 
tics because two different concepts of con- 
cepts seem to be in play in these fields. How- 
ever, it cannot be true that the concepts 
we use in online sentence understanding 
are unrelated to the concepts we employ in 
portunity for theorists and experimenters
here to provide an account of the interface 
between these functions. One possibility, for 
example, is to use sentence comprehension 
techniques to track the way that the lexical 
content of a word in speech or text is trans- 
formed in deeper processing (see Pinango, 
Zurif, & Jackendoff, 1999, for one effort in 
this direction). Another type of effort at in- 
tegration is Wolff and Song’s (2003) work 
on causal verbs and people’s perception of 
cause in which they contrast predictions de- 
rived from cognitive linguistics with those 
from cognitive psychology. 


Greater Diversity of Participant 
Populations 


Although research with U.S. undergradu- 
ates at major universities will probably never 
go out of style (precedent and convenience 
are two powerful staying forces), we expect
the recent increase in the use of other populations
to continue. Work by Nisbett and
his associates (e.g., Nisbett & Norenzayan, 
2002; Nisbett, Peng, Choi, & Norenzayan, 
2001) has called into question the idea that 
basic cognitive processes are universal, and 
categories and conceptual functions are ba- 
sic cognitive functions. In much of the work 
by Atran, Medin, and their associates, un- 
dergraduates are the “odd group out” in the 
sense that their results deviate from those 
of other groups. In addition, cross-linguistic 
studies are often an effective research tool 
for addressing questions about the relation- 
ship between linguistic and conceptual de- 
velopment (e.g., Waxman, 1999). 


More Psychometaphysics 


An early critique of the theory theory is that 
it suffered from vagueness and imprecision. 
As we have seen in this review, however, this 
framework has led to more specific claims 
(e.g., Ahn’s causal status hypothesis) and the 
positions are clear enough to generate the- 
oretical controversies (e.g., contrast Smith, 
Jones, & Landau, 1996 with Gelman, 2000, 
and Booth & Waxman, 2002, in press, with 
Smith, Jones, Yoshida, & Colunga, 2003). It 


in these questions. 


All of the Above in Combination 


Concepts and categories are shared by all the 
cognitive sciences, and so there is very little 
room for researchers to stake out a single 
paradigm or subtopic and work in blissful 
isolation. Although the idea of a seman- 
tic memory uniting memory structure, 
lexical organization, and categorization 
may have been illusory, this does not 
mean that progress is possible by ig- 
noring the insights on concepts that 
these perspectives (and others) pro- 
vide. We may see further fragmentation 
in the concepts of concepts, but it will still 
be necessary to explore the relations among 
them. Our only firm prediction is that the 
work we will find most exciting will be re- 
search that draws on multiple points of view. 
CHAPTER 4


Approaches to Modeling Human Mental 
Representations: What Works, What 
Doesn’t, and Why 


Leonidas A. A. Doumas 
John E. Hummel 


Relational Thinking 


A fundamental aspect of human intelligence 
is the ability to acquire and manipulate 
relational concepts. Examples of relational 
thinking include our ability to appreciate 
analogies between seemingly different ob- 
jects or events (e.g., Gentner, 1983; Gick 
& Holyoak, 1980, 1983; Holyoak & Tha- 
gard, 1995; see Holyoak, Chap. 6), our abil- 
ity to apply abstract rules in novel situations 
(e.g., Smith, Langston, & Nisbett, 1992), 
our ability to understand and learn language 
(e.g., Kim, Pinker, Prince, & Prasada, 1991), 
and even our ability to appreciate percep- 
tual similarities (e.g., Goldstone, Medin, & 
Gentner, 1991; Hummel, 2000; Hummel & 
Stankiewicz, 1996; Palmer, 1978; see Gold- 
stone & Son, Chap. 2). Relational thinking 
is ubiquitous in human cognition, under- 
lying everything from the mundane (e.g., 
the thought “the mug is on the desk”) to 
the sublime (e.g., Cantor’s use of set the- 
ory to prove that the cardinal number of the 
reals is greater than the cardinal number of 
the integers). 


Relational thinking is so commonplace 
that it is easy to assume the psychologi- 
cal mechanisms underlying it are relatively 
simple. They are not. The capacity to form 
and manipulate relational representations 
appears to be a late evolutionary develop- 
ment (Robin & Holyoak, 1995) closely tied 
to the increase in the size and complexity 
of the frontal cortex in the brains of higher 
primates, especially humans (Stuss & Ben- 
son, 1986). Relational thinking also develops
relatively late in childhood (see, e.g.,
Smith, 1989; Halford, Chap. 22). Along with 
language, the human capacity for relational 
thinking is the major factor distinguish- 
ing human cognition from the cognitive 
abilities of other animals (for reviews, see 
Holyoak & Thagard, 1995; Oden, Thompson,
& Premack, 2001; Call & Tomasello,
Chap. 25).


Relational Representations 


Central to understanding human relational 
thinking is understanding the nature of the 
mental representations underlying it: How 
as “if every element of set A is paired with
a distinct element of set B, and there are 
still elements of B left over, then the cardinal
number of B is greater than the cardinal
number of A,” or even simple relations
such as “John loves Mary” or “the mag- 
azine is next to the phone”? Two prop- 
erties of human relational representations 
jointly make this apparently simple question 
surprisingly difficult to answer (Hummel 
& Holyoak, 1997): As elaborated in the 
next sections, human relational representa- 
tions are both symbolic and semantically rich. 
Although these properties are straightfor- 
ward to account for in isolation, account- 
ing for both together has proven much 
more challenging. 


RELATIONAL REPRESENTATIONS ARE SYMBOLIC 


A symbolic representation is one that rep- 
resents relations explicitly and specifies the 
arguments to which they are bound. Rep- 
resenting relations explicitly means having 
primitives (i.e., symbols, nodes in a network, 
neurons) that correspond specifically to rela- 
tions and/or relational roles. This definition 
of “explicit,” which we take to be uncontro- 
versial (see also Halford et al., 1998; Holland 
et al., 1986; Newell, 1990), implies that rela- 
tions are represented independently of their 
arguments (Hummel & Biederman, 1992; 
Hummel & Holyoak, 1997, 2003a). That is,
the representation of a relation cannot vary 
as a function of the arguments it happens to 
take at a given time, and the representation 
of an argument cannot vary across relations 
or relational roles. 

Some well-known formal representa- 
tional systems that meet this requirement in- 
clude propositional notation, labeled graphs, 
mathematical notation, and computer pro- 
gramming languages (among many others). 
For example, the relation murders is repre- 
sented in the same way (and means the same 
thing) in the proposition murders (Bill, Su- 
san) as it is in the proposition murders (Sally, 
Robert), even though it takes different ar- 
guments across the two expressions. Like- 
wise, “2” means the same thing in x^2 as in 2^x,


pressions. At the same time, relational repre- 
sentations explicitly specify how arguments 
are bound to relational roles. The relation 
“murders (Bill, Susan)” differs from “murders 
(Susan, Bill)” only in the binding of argu- 
ments to relational roles, yet the two expres- 
sions mean very different things (especially 
to Susan and Bill). 
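
The two properties just described can be made concrete with a
minimal Python sketch. It is an illustration only, not a model
proposed in this chapter: the class name Proposition and its
fields relation and args are assumptions introduced here. The
relation symbol is stored independently of its arguments, and
the order of the arguments carries the role bindings.

```python
from typing import NamedTuple, Tuple


class Proposition(NamedTuple):
    """Hypothetical container for a proposition such as murders (Bill, Susan)."""
    relation: str            # relation symbol, the same whatever its arguments
    args: Tuple[str, ...]    # position in the tuple encodes the role binding


p1 = Proposition("murders", ("Bill", "Susan"))
p2 = Proposition("murders", ("Susan", "Bill"))

# The relation and the set of arguments are shared by both propositions...
same_relation = p1.relation == p2.relation
same_arguments = set(p1.args) == set(p2.args)

# ...but the bindings of arguments to roles differ.
same_bindings = p1.args == p2.args

print(same_relation, same_arguments, same_bindings)
```

Under these assumptions the two propositions agree in their
relation and in their arguments and differ only in their
bindings, which is the contrast drawn in the text.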

The claim that formal representational 
systems (e.g., propositional notation, mathe- 
matical notation) are symbolic is completely 
uncontroversial. In contrast, the claim that 
human mental representations are symbolic 
is highly controversial (for reviews, see 
Halford et al., 1998; Hummel & Holyoak, 
1997, 2003a; Marcus, 1998, 2001). The best- 
known argument for the role of symbolic 
representations in human cognition — the ar- 
gument from systematicity — was made by 
Fodor and Pylyshyn (1988). They observed 
that knowledge is systematic in the sense 
that the ability to think certain thoughts 
seems to imply the ability to think related 
thoughts. For example, a person who un- 
derstands the concepts “John,” “Mary,” and 
“loves,” and can understand the statement 
“John loves Mary,” must surely be able to 
understand “Mary loves John.” This prop- 
erty of systematicity, they argued, demon- 
strates that human mental representations 
are symbolic. Fodor and Pylyshyn’s argu- 
ments elicited numerous responses from 
the connectionist community claiming to 
achieve or approximate systematicity in 
nonsymbolic (e.g., traditional connectionist)
architectures (for a recent example, see
Edelman & Intrator, 2003). At the same 
time, however, Fodor and Pylyshyn’s defi- 
nition of “systematicity” is so vague that it 
is difficult or impossible to evaluate these 
claims of “systematicity achieved or approx- 
imated” (van Gelder & Niklasson, 1994; for 
an example of the kind of confusion that 
has resulted from the attempt to approxi- 
mate systematicity, see Edelman & Intrator, 
2003, and the reply by Hummel, 2003). The 
concept of “systematicity” has arguably done 
more to cloud the debate over the role of 
symbolic representations in human cogni- 
tion than to clarify it. 
fine symbolic competence is in terms of the
ability to appreciate what different bind- 
ings of the same relational roles and fillers 
have in common and how they differ (see 
also Garner, 1974; Hummel, 2000; Hummel 
& Holyoak, 1997, 2003; Saiki & Hummel, 
1998). Under this definition, what matters 
is the ability to appreciate what “John loves 
Mary” has in common with “Mary loves 
John” (i.e., the same relations and arguments
are involved) and how they differ (i.e.,
the role-filler bindings are reversed). It does 
not strictly matter whether you can “understand”
the statements, or even whether they
make any sense. What matters is that you 
can evaluate them in terms of the relations 
among their components. This same ability 
allows you to appreciate how “the glimby 
jolls the ronket” is similar to and differ- 
ent from “the ronket jolls the glimby,” even 
though neither statement inspires much by 
way of understanding. To gain a better ap- 
preciation of the abstractness of this ability, 
note that the ronket and glimby may not 
even be organisms (as we suspect most read- 
ers initially assume they are) but may instead 
be machine parts, mathematical functions, 
plays in a strategy game, or anything else that
can be named. 
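
This kind of evaluation can be sketched in a few lines of
Python. The sketch is offered only as an illustration under
stated assumptions: the helper names bindings and compare, and
the positional role labels such as "jolls-1", are invented here
and are not part of any model discussed in the chapter. The
comparison needs no knowledge of what a glimby or a ronket is.

```python
from typing import Dict, List, Tuple


def bindings(relation: str, args: Tuple[str, ...]) -> Dict[str, str]:
    """Map each positional role (e.g. 'jolls-1') to the filler bound to it."""
    return {f"{relation}-{i + 1}": filler for i, filler in enumerate(args)}


def compare(
    rel_a: str, args_a: Tuple[str, ...], rel_b: str, args_b: Tuple[str, ...]
) -> Tuple[List[str], List[str], List[str]]:
    """Return shared roles, shared fillers, and roles whose filler differs."""
    a, b = bindings(rel_a, args_a), bindings(rel_b, args_b)
    shared_roles = sorted(a.keys() & b.keys())
    shared_fillers = sorted(set(a.values()) & set(b.values()))
    changed_roles = sorted(r for r in shared_roles if a[r] != b[r])
    return shared_roles, shared_fillers, changed_roles


shared_roles, shared_fillers, changed_roles = compare(
    "jolls", ("glimby", "ronket"), "jolls", ("ronket", "glimby")
)
print(shared_roles, shared_fillers, changed_roles)
```

Here both statements share the same two roles and the same two
fillers, and every role has changed its filler, which is one way
of stating that the role-filler bindings are reversed.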

This definition of symbolic competence 
admits to more objective evaluation than 
does systematicity: one can empirically evaluate,
for any f, x, and y, whether someone
knows what f(x, y) has in common with and
how it differs from f(y, x). It is also important
because it relates directly to what we take to
be the defining property of a symbolic (i.e.,
explicitly relational) representation: namely, 
as noted previously, the ability to represent 
relational roles independently of their argu- 
ments and to simultaneously specify which 
roles are bound to which arguments (see also 
Hummel, 2000, 2003; Hummel & Holyoak, 
1997, 2003a). It is the independence of roles
and fillers that allows one to appreciate that 
the glimby in “the glimby jolls the ronket” is 
the same thing as the glimby in “the ron- 
ket jolls the glimby”; and it is the ability 
to explicitly bind arguments to relational 
roles that allows one to know how the two 


ity to appreciate these similarities and differ- 
ences as strong evidence that the represen- 
tations underlying human relational thinking 
are symbolic. 


RELATIONAL REPRESENTATIONS ARE 
SEMANTICALLY RICH 

The second fundamental property of hu- 
man relational representations, and human 
mental representations more broadly, is that 
they are semantically rich. It means some- 
thing to be a lover or a murderer, and the 
human mental representation of these rela- 
tions makes this meaning explicit. As a re- 
sult, there is an intuitive sense in which loves 
(John, Mary) is more like likes (John, Mary) 
than murders (John, Mary). Moreover, the 
meanings of various relations seem to ap- 
ply specifically to individual relational roles 
rather than to relations as indivisible wholes. 
For example, it is easy to appreciate that the 
agent (i.e., killer) role of murders (x, y) is
similar to the agent role of attempted-murder 
(x, y) even though the patient roles dif- 
fer (i.e., the patient is dead in the former 
case but not the latter), and the patient role 
of murder (x, y) is like the patient role of 
manslaughter (x, y) even though the agent 
roles differ (i.e., the act is intentional in the 
former case but not the latter). 

The semantic richness of human rela- 
tional representations is also evidenced by 
their flexibility (Hummel & Holyoak, 1997). 
Given statements such as taller-than (Abe, 
Bill), tall (Charles), and short (Dave), it is 
easy to map Abe onto Charles and Bill onto 
Dave even though doing so requires the rea- 
soner to violate the “n-ary restriction” (i.e., 
mapping the argument(s) and role(s) of an 
n-place predicate onto those of an m-place 
predicate, where m ≠ n). Given shorter-than
(Eric, Fred), it is also easy to map Eric onto 
Bill (and Dave) and Fred onto Abe (and 
Charles). These mappings are based on the 
semantics of individual roles, rather than, for 
instance, the fact that taller-than and shorter- 
than are logical opposites: The relation loves 
(x, y) is in some sense the opposite of hates 
(x, y) [or if you prefer, not-loves (x, y)] but 
in contrast to taller-than and shorter-than in 
the second role of the other, the first role
of loves (x, y) maps to the first role of hates 
(x, y) [or not-loves (x, y)]. The point is that 
the similarity and/or mappings of various re- 
lational roles are idiosyncratic and based not 
on the formal syntax of propositional nota- 
tion, but on the semantic content of the indi- 
vidual roles in question. The semantics of re- 
lational roles matter and are an explicit part 
of the mental representation of relations. 
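
One way to picture role-based mapping is the following minimal
Python sketch. It is not the mechanism of any model reviewed in
this chapter. The feature sets in ROLE_FEATURES are hypothetical
values invented for illustration, as are the names overlap and
best_match, and Jaccard overlap is simply a convenient stand-in
for semantic similarity. The point is only that fillers can be
mapped by the semantics of the roles they occupy, even across
predicates with different numbers of places.

```python
from typing import Dict, FrozenSet, Tuple

Role = Tuple[str, int]  # (predicate name, argument position)

# Hypothetical semantic features for each role (assumed, not from the text).
ROLE_FEATURES: Dict[Role, FrozenSet[str]] = {
    ("taller-than", 1): frozenset({"height", "more"}),
    ("taller-than", 2): frozenset({"height", "less"}),
    ("shorter-than", 1): frozenset({"height", "less"}),
    ("shorter-than", 2): frozenset({"height", "more"}),
    ("tall", 1): frozenset({"height", "more"}),
    ("short", 1): frozenset({"height", "less"}),
}


def overlap(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity between two feature sets."""
    return len(a & b) / len(a | b)


def best_match(role: Role, candidates: Dict[str, Role]) -> str:
    """Return the candidate filler whose role is most similar to `role`."""
    return max(
        candidates,
        key=lambda name: overlap(ROLE_FEATURES[role], ROLE_FEATURES[candidates[name]]),
    )


# taller-than (Abe, Bill), tall (Charles), short (Dave)
source = {"Abe": ("taller-than", 1), "Bill": ("taller-than", 2)}
target = {"Charles": ("tall", 1), "Dave": ("short", 1)}
mapping = {name: best_match(role, target) for name, role in source.items()}

# shorter-than (Eric, Fred) mapped onto the same one-place predicates
source2 = {"Eric": ("shorter-than", 1), "Fred": ("shorter-than", 2)}
mapping2 = {name: best_match(role, target) for name, role in source2.items()}

print(mapping, mapping2)
```

With these assumed features, Abe maps to Charles and Bill to
Dave, while Eric maps to Dave and Fred to Charles. A scheme that
matched arguments by position alone could not produce these
correspondences, because the predicates differ in arity.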

The semantic properties of relational 
roles manifest themselves in numerous other 
ways in human cognition. For example, they 
influence both memory retrieval (e.g., Gen- 
tner, Ratterman, & Forbus, 1993; Ross, 1987; 
Wharton, Holyoak, & Lange, 1996) and our 
ability to discover structurally appropriate 
analogical mappings (Bassok, Wu, & Olseth, 
1995; Krawczyk, Holyoak, & Hummel, in 
press; Kubose, Holyoak, & Hummel, 2002; 
Ross, 1987). They also influence which in- 
ferences seem plausible from a given collec- 
tion of stated facts. For instance, upon learn- 
ing about a culture in which nephews tra- 
ditionally give their aunts a gift on a par- 
ticular day of the year, it is a reasonable 
conjecture that there may also be a day on 
which nieces in this culture give their uncles 
gifts. This inference is based on the seman- 
tic similarity of aunts to uncles and nieces 
to nephews, and on the semantics of gift 
giving, not the syntactic properties of the 
give-gift relation. 

In summary, human mental representa- 
tions are both symbolic (i.e., they explic- 
itly represent relations and the bindings of 
relational roles to their fillers) and seman- 
tically rich (in the sense that they make 
the semantic content of individual relational 
roles and their fillers explicit). A complete 
account of human thinking must elucidate 
how each of these properties can be achieved 
and how they work together. An account 
that achieves one property at the expense of 
the other is at best only a partial account of 
human thinking. The next section reviews 
the dominant approaches to modeling hu- 
man mental representations, with an em- 
phasis on how each approach succeeds or 
fails to capture these two properties of human
mental representations. We review traditional
symbolic approaches to mental representation,
traditional distributed connectionist approaches,
conjunctive distributed
connectionist approaches (based on tensor 
products and their relatives), and an ap- 
proach based on dynamic binding of dis- 
tributed and localist connectionist represen- 
tations into symbolic structures. 


Approaches to Modeling Human 
Mental Representation 


Symbol-Argument-Argument Notation 


The dominant approach to modeling rela- 
tional representations in the computational 
literature is based on propositional notation 
and formally equivalent systems (including 
varieties of labeled graphs and high-rank ten- 
sor representations). These representational 
systems — which we refer to collectively 
as symbol-argument-argument notation, 
or “SAA” — borrow conventions directly 
from propositional calculus and are com- 
monly used in symbolic models based on 
production systems (see Lovett & Anderson, 
Chap. 17, for a review), many forms of graph
matching (e.g., Falkenhainer et al., 1989; 
Keane et al., 1994) and related algorithms. 
SAA represents relations and their argu- 
ments as explicit symbols and represents the 
bindings of arguments to relational roles in 
terms of the locations of the arguments in 
the relational expression. For example, in the 
proposition loves (John, Mary), John is 
bound to the lover role by virtue of appear- 
ing in the first slot after the open paren- 
thesis, and Mary to the beloved by virtue of 
appearing in the second slot. Similarly, in a 
labeled graph the top node (of the local sub- 
graph coding “John loves Mary”) represents 
the loves relation, and the nodes directly be- 
low it represent its arguments with the bind- 
ings of arguments to roles captured, for ex- 
ample, by the order (left to right) in which 
those arguments are listed. These schemes, 
which may look different at first pass, are in 
fact isomorphic. In both cases, the relation 
is represented by a single symbol, and the
bindings of arguments to roles are captured by
the syntax of the notation (as list
position within parentheses, as the locations 
of nodes in a directed graph, etc.). 

Models based on SAA are meaningfully 
symbolic in the sense described previously: 
They represent relations explicitly (i.e., independently
of their arguments), and they
explicitly specify the bindings of relational 
roles to their arguments. This fact is no sur- 
prise, given that SAA is based on represen- 
tational conventions that were explicitly de- 
signed to meet these criteria. However, the 
symbolic nature of SAA is nontrivial because 
it endows models based on SAA with all 
the advantages of symbolic representations. 
Most important, symbolic representations 
enable relational generalization — generaliza- 
tions that are constrained by the relational 
roles that objects play, rather than simply 
the features of the objects themselves (see 
Holland et al., 1986; Holyoak & Thagard, 
1995; Hummel & Holyoak, 1997, 2003a;
Thompson & Oden, 2000). Relational gener- 
alization is important because, among other 
things, it makes it possible to define, match, 
and apply variablized rules. (It also makes 
it possible to make and use analogies, to 
learn and use schemas, and ultimately to 
learn variablized rules from examples; see 
Hummel & Holyoak, 2003a.) For example, 
with a symbolic representational system, it 
is possible to define the rule “if loves (x, y) 
and loves (y, z) and not [loves (y, x)], then 
jealous (x, z)” and apply that rule to any 
x, y, and z that match its left-hand (“if”) 
side. As elaborated shortly, this important 
capacity, which plays an essential role in hu- 
man relational thinking, lies fundamentally 
beyond the reach of models based on nonsymbolic
representations (Holyoak & Hummel,
2000; Hummel & Holyoak, 2003a;
Marcus, 1998). 
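
The variablized rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not a model from the literature: the tuple encoding of propositions and the function name `jealous_inferences` are assumptions introduced here. The point it shows is that the rule is stated once, over the variables x, y, and z, and then applies to any fillers that match its "if" side.

```python
from itertools import product

# Facts in symbol-argument-argument form: (relation, first argument, second argument).
# The tuple encoding is an assumption made for this sketch.
facts = {
    ("loves", "John", "Mary"),
    ("loves", "Mary", "George"),
}

def jealous_inferences(facts):
    """Apply: if loves(x, y) and loves(y, z) and not loves(y, x), then jealous(x, z)."""
    # Collect every object that appears as an argument of some fact.
    objects = sorted({arg for _, *args in facts for arg in args})
    inferred = set()
    # Try every binding of objects to the variables x, y, and z.
    for x, y, z in product(objects, repeat=3):
        if (("loves", x, y) in facts
                and ("loves", y, z) in facts
                and ("loves", y, x) not in facts):
            inferred.add(("jealous", x, z))
    return inferred

print(jealous_inferences(facts))
```

Because the rule mentions only roles and variables, swapping in Alice, Sam, and Betty requires no change to the rule itself.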

Given the symbolic nature of SAA, it is no 
surprise that it has figured so prominently in 
models of relational thinking and symbolic 
cognition more generally (see Lovett & An- 
derson, Chap. 17). Less salient are the limi- 
tations of SAA. It has been known for a long 
time that SAA and related representational 
schemes have difficulty capturing shades of 


semantic content. This limitation was a central
focus of the influential critiques of symbolic
modeling presented by the connectionists
in the mid-1980s (e.g., Rumelhart et al.,
1986). A review of how traditional symbolic 
models have handled this problem (typi- 
cally with external representational systems 
such as lookup tables or matrices of hand- 
coded “similarity” values between symbols; 
see Lovett & Anderson, Chap. 17) also reveals
that the question of semantics in SAA
is, at the very least, a thorny inconvenience
(Hummel & Holyoak, 1997). However, at 
the same time, it is tempting to assume it is 
merely an inconvenience — that surely there 
exists a relatively straightforward way to add 
semantic coding to propositional notation 
and other forms of SAA and that a solu- 
tion will be found once it becomes impor- 
tant enough for someone to pay attention to 
it. In the meantime, it is surely no reason to
abandon SAA as a basis for modeling human 
cognition. 

However, it turns out that it is more than 
a thorny inconvenience: As demonstrated 
by Doumas and Hummel (2004), it is logi- 
cally impossible to specify the semantic con- 
tent of relational roles within an SAA rep- 
resentation. In brief, SAA representations 
cannot represent relational roles explicitly 
and simultaneously specify how they come 
together to form complete relations. The 
reason for this limitation is that SAA repre- 
sentations specify role information only im- 
plicitly (see Halford et al., 1998). Specify- 
ing this information explicitly requires new 
propositions, which must be related to the 
original relational representation via a sec- 
ond relation. In SAA, this results in a new 
relational proposition, which itself implies 
role representations to which it must be re- 
lated by a third relational proposition, and 
so forth, ad infinitum. In short, attempt- 
ing to use SAA to link relational roles to 
their parent relations necessarily results in 
an infinite regress of nested “constituent of” 
relations specifying which roles belong to 
which relations/roles (see Doumas & Hummel,
2004, for the full argument). As a result,
attempting to use SAA to specify how roles 
come together to form relations renders the
system ill-typed (i.e., inconsistent and/or
paradoxical; see, e.g., Manzano, 1996). 

The result of this limitation is that SAA 
systems are forced to use external (i.e., non-SAA)
structures to represent the meaning of
symbols (or to approximate those meanings, 
e.g., with matrices of similarity values) and 
external control systems (which themselves 
cannot be based on SAA) to read the SAA, 
access the external structures, and relate the 
two. Thus, it is no surprise that SAA-based 
models rely on lookup tables, similarity ma- 
trices and so forth to specify how different 
relations and objects are semantically related 
to one another: It is not merely a conve- 
nience; it is a necessity. 

This property of SAA sharply limits its 
utility as a general approach to modeling 
human mental representations. In particu- 
lar, it means that the connectionist critiques 
of the mid-1980s were right: Not only do
traditional symbolic representations fail to 
represent the semantic content of the ideas 
they mean to express, but the SAA represen- 
tations on which they are based cannot even 
be adapted to do so. The result is that SAA 
is ill equipped, in principle, to address those 
aspects of human cognition that depend on 
the semantic content of relational roles and 
the arguments that fill them (which, as sum- 
marized previously, amounts to a substan- 
tial proportion of human cognition). This 
fact does not mean that models based on 
SAA (i.e., traditional symbolic models) are 
“wrong” but only that they are incomplete. 
SAA is at best only a shorthand (a very 
short hand) approximation of human mental 
representations. 


Traditional Connectionist 
Representations 


In response to limitations of traditional sym- 
bolic models, proponents of connectionist 
models of cognition (see, e.g., Elman et al., 
1996; Rumelhart et al., 1986; St. John & Mc- 
Clelland, 1990; among many others) have 
proposed that knowledge is represented not 
as discrete symbols that enter into symbolic 
expressions but as patterns of activation 
distributed over many processing elements. 


These representations are distributed in the
sense that (1) any single concept is represented
as a pattern (i.e., vector) of activation
over many elements (“nodes” or “units”
that are typically assumed to correspond 
roughly to neurons or small collections of 
neurons), and (2) any single element will participate
in the representation of many different
concepts. As a result, two patterns of activation
will tend to be similar to the extent
that they represent similar concepts: In con- 
trast to SAA, distributed connectionist rep- 
resentations provide a natural basis for rep- 
resenting the semantic content of concepts. 
Similar ideas have been proposed in the con- 
text of latent semantic analysis (Landauer 
& Dumais, 1997) and related mathemati- 
cal techniques for deriving similarity metrics 
from the co-occurrence statistics of words in 
passages of text (e.g., Lund & Burgess, 1996). 
In all these cases, concepts are represented 
as vectors, and vector similarity is taken as 
an index of the similarity of the corres- 
ponding concepts. 
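
As a small illustration of vector similarity as an index of conceptual similarity, the sketch below compares hand-built feature vectors using the cosine of the angle between them. The features and their values are hypothetical and chosen only for the example; they are not taken from any of the models cited above.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two activation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical features: [human, adult, female, male, kin, vehicle]
aunt = [1, 1, 1, 0, 1, 0]
uncle = [1, 1, 0, 1, 1, 0]
jeep = [0, 0, 0, 0, 0, 1]

print(cosine(aunt, uncle))  # overlapping patterns give a high similarity
print(cosine(aunt, jeep))   # non-overlapping patterns give zero similarity
```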

Because distributed activation vectors 
provide a natural basis for capturing the sim- 
ilarity structure of a collection of concepts 
(see Goldstone & Son, Chap. 2), connection- 
ist models have enjoyed substantial success 
simulating various kinds of learning and gen- 
eralization (see Munakata & O'Reilly, 2003): 
Having been trained to give a particular 
output (e.g., generate a specific activation 
vector on a collection of output units) in 
response to a given input (i.e., vector of 
activations on a collection of input units), 
connectionist networks tend to generalize 
automatically (i.e., activate an appropriate 
output vector, or a close approximation of 
it) in response to new inputs that are similar 
to trained inputs. In a sense, connectionist 
representations are much more flexible than 
symbolic representations based on varieties 
of SAA. Whereas models based on SAA require
predicates to match exactly in order to
treat them identically, connectionist models
generalize more gracefully based on the
degree of overlap between trained patterns 
and new ones. 

In another sense, however, connectionist 
models are substantially less flexible than 
symbolic models. The reason is that the
representations used by traditional connectionist
models are not symbolic
in the sense defined previously. That is,
they cannot represent relational roles inde- 
pendently of their fillers and simultaneously 
specify which roles are bound to which fillers 
(Hummel & Holyoak, 1997, 2003a). Instead,
a network’s knowledge is represented as sim- 
ple vectors of activation. Under this ap- 
proach, relational roles (to the extent that 
they are represented at all) are either repre- 
sented on separate units from their potential 
fillers (e.g., with one set of units for the lover 
role of the loves relation, another set for the 
beloved role, a third set for John, a fourth 
set for Mary, etc.), in which case the bind- 
ings of roles to their fillers is left unspecified 
(i.e., simply activating all four sets of units 
cannot distinguish “John loves Mary” from 
“Mary loves John” or even from a statement 
about a narcissistic hermaphrodite); or else 
units are dedicated to specific role-filler con- 
junctions (e.g., with one set of units for “John 
as lover” another for “John as beloved”, etc.; 
e.g., Hinton, 1990), in which case the bind- 
ings are specified, but only at the expense of 
role-filler independence (e.g., nothing rep- 
resents the lover or beloved roles, indepen- 
dently of the argument to which they hap- 
pen to be bound). In neither case are the 
resulting representations truly symbolic. 
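
The two horns of this dilemma can be made concrete with a toy sketch in which a representation is simply the set of units that are active. The unit names and helper functions are hypothetical, introduced only for illustration.

```python
# Scheme 1: separate units for roles and for fillers (unit names are hypothetical).
def separate_units(lover, beloved):
    # All role units and all filler units are simply switched on together.
    return frozenset({"lover-role", "beloved-role", lover, beloved})

# Scheme 2: one dedicated unit per role-filler conjunction.
def conjunctive_units(lover, beloved):
    return frozenset({f"{lover}-as-lover", f"{beloved}-as-beloved"})

# Scheme 1 leaves the bindings unspecified: both statements activate the same units.
print(separate_units("John", "Mary") == separate_units("Mary", "John"))

# Scheme 2 specifies the bindings, but the two statements now share no units at all,
# so nothing represents "John" or the lover role independently of its binding.
print(conjunctive_units("John", "Mary") & conjunctive_units("Mary", "John"))
```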
Indeed, some proponents of traditional 
connectionist models (e.g., Elman et al., 
1996) — dubbed “eliminative connectionists” 
by Pinker and Prince (1988; see also Marcus, 
1998) for their explicit desire to eliminate 
the need for symbolic representations from 
models of cognition — are quite explicit in 
their rejection of symbolic representations as 
a component of human cognition. Instead of 
representing and matching symbolic “rules,” 
eliminative (i.e., traditional) connectionist 
models operate by learning to associate vec- 
tors of features (where the features corre- 
spond to individual nodes in the network). 
As a result, they are restricted to generaliz- 
ing based on the shared features in the train- 
ing set and the generalization set. Although 
the generalization capabilities of these net- 
works often appear quite impressive at first 
blush (especially if the training set is judi- 
ciously chosen to span the space of all possi- 


2001), the resulting models are not capable 
of relational generalization (see Hummel & 
Holyoak, 1997, 2003a; Marcus, 1998, 2001, 
for detailed discussions of this point). 

A particularly clear example of the im- 
plications of this limitation comes from the 
story Gestalt model of story comprehension 
developed by St. John (1992; St. John & 
McClelland, 1990). In one computational 
experiment (St. John, 1992, simulation 1), 
the model was first trained with 1,000,000 
short texts consisting of statements based on 
136 constituent concepts. Each story instan- 
tiated a script such as “<person> decided to 
go to <destination>; <person> drove <ve- 
hicle> to <destination>” (e.g., “George de- 
cided to go to a restaurant; George drove a 
Jeep to the restaurant”; “Harry decided to 
go to the beach; Harry drove a Mercedes to 
the beach”). 

After the model had learned a network 
of associative connections based on the 
1,000,000 examples, St. John tested its abil- 
ity to generalize by presenting it with a text 
containing a new statement, such as “John 
decided to go to the airport.” Although the 
statement as a whole was new, it referred 
to people, objects and places that had ap- 
peared in the examples used for training. St. 
John reported that when given a new exam- 
ple about deciding to go to the airport, the 
model would typically activate the restau- 
rant or the beach (i.e., the destinations in 
prior examples of the same script) as the 
destination, rather than making the contex- 
tually appropriate inference that the per- 
son would drive to the airport. This type 
of error, which would appear quite unnat- 
ural in human comprehension, results from 
the model's inability to generalize relationally
(e.g., if a person wants to go to location x,
then x will be the person’s destination — a 
problem that requires the system to repre- 
sent the variable x and its value, indepen- 
dently of its binding to the role of desired 
location or destination). As St. John noted, 
“Developing a representation to handle role 
binding proved to be difficult for the model” 
(1992, p. 294).

In general, although an eliminative connectionist
model can make “inferences” on
which it has been directly trained (i.e.,
the model will remember particular associations
that have been strengthened by learning),
the acquired knowledge may not generalize
at all to novel instantiations that
lie outside the training set (Marcus, 1998, 
2001). For example, having learned that Al- 
ice loved Sam, Sam loved Betty, and Al- 
ice was jealous of Betty, and told that John 
loves Mary and Mary loves George, a per- 
son is likely to conjecture that John is likely 
to be jealous of George. An eliminative 
connectionist system would be at a complete
loss to make any inferences: John, Mary,
and George are different people than Al- 
ice, Sam, and Betty (Holyoak & Hummel, 
2000; Hummel & Holyoak, 2003; Phillips & 
Halford, 1997). 

A particularly simple example that re- 
veals such generalization failures is the iden- 
tity function (Marcus, 1998). Suppose, for 
example, that a human reasoner was trained 
to respond with “1” to “1,” “2” to “2,” and “3” 
to “3.” Even with just these three examples,
the human is almost certain to respond with 
“4” to “4,” without any direct feedback that 
this is the correct output for the new case. In 
contrast, an eliminative connectionist model 
will be unable to make this obvious general- 
ization. Such a model can be trained to give 
specific outputs to specific inputs (e.g., as il- 
lustrated in Figure 4.1). But when training 
is over, it will have learned only the input- 
output mappings on which it was trained 
(and perhaps those that can be represented 
by interpolating between trained examples; 
see Marcus, 1998): Because the model lacks 
the capacity to represent variables, extrap- 
olation outside the training set is impossi- 
ble. In other words, the model will simply 
have learned to associate “1” with “1,” “2”
with “2,” and “3” with “3.” A human, by contrast,
will have learned to associate input (x)
with output (x), for any x; and doing so re- 
quires the capacity to bind any new number 
(whether it was in the training space or not) 
to the variable x. Indeed, most people are 
willing to generalize even beyond the world 
of numbers. We leave it to the reader to give 
the appropriate outputs in response to the 
following inputs: “A”; “B”; “flower.” 
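
The failure to extrapolate can be reproduced with a minimal sketch of a two-layer network with one unit per number. Several details are assumptions made here rather than features of any published simulation: the units are linear, the weights start at zero, and learning uses the delta rule with a learning rate of 0.5. Under those assumptions the connections leaving the "4" input unit are never changed by training, so the network has nothing to say about "4".

```python
N = 5  # one input unit and one output unit for each of the numbers 1..5

def one_hot(k):
    """Localist code: only the unit labeled k is active."""
    return [1.0 if i == k - 1 else 0.0 for i in range(N)]

def forward(W, x):
    """Linear output units: each output is a weighted sum of the inputs."""
    return [sum(W[j][i] * x[i] for i in range(N)) for j in range(N)]

def train(pairs, epochs=50, lr=0.5):
    """Delta-rule training, starting from all-zero weights (an assumption)."""
    W = [[0.0] * N for _ in range(N)]
    for _ in range(epochs):
        for x, target in pairs:
            y = forward(W, x)
            for j in range(N):
                for i in range(N):
                    W[j][i] += lr * (target[j] - y[j]) * x[i]
    return W

def respond(W, k):
    return [round(v, 3) for v in forward(W, one_hot(k))]

# Train the identity mapping on 1, 2, and 3 only.
W = train([(one_hot(k), one_hot(k)) for k in (1, 2, 3)])

print(respond(W, 2))  # trained input: the "2" output unit is active
print(respond(W, 4))  # untrained input: no output unit is active
```

With a random rather than zero initialization the response to "4" would be arbitrary instead of silent, but in neither case would it be driven by the trained identity relation.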


The reason that the connectionist model illustrated in Figure 4.1 fails
to learn the identity function is that it vio- 
lates variable/value (i.e., role/filler) indepen- 
dence. The input and output units in Figure 
4.1 are intentionally mislabeled to suggest 
that they represent the concepts “1,” “2,” and 
so on. However, in fact, they do not repre- 
sent these concepts at all. Instead, the unit 
labeled “1” in the input layer represents not 
“1,” but “1 as the input to the identity function.”
That is, it represents a conjunctive binding 
of the value “1” to the variable “input to 
the function.” Likewise, the unit labeled “1” 
in the output layer represents, not “1,” but 
“1 as output of the identity function.” Thus,
counter to initial appearances, the concept
“1” is not represented anywhere in the network.
Neither, for that matter, is the concept
“input to the identity function”: Every unit in 
the input layer represents some specific input 
to the function; there are no units to repre- 
sent input as a generic unbound variable. 
Because of this representational convention
(i.e., representing variable-value
conjunctions instead of variables and val- 
ues), traditional connectionist networks are 
forced to learn the identity function as a 
mapping from one set of conjunctive units 
(the input layer) to another set of conjunc- 
tive units (the output layer). This mapping, 
which to our eye resembles an approxima- 
tion of the identity function, f(x) = x, is, to 
the network, just an arbitrary mapping. It 
is arbitrary precisely because the unit repre- 
senting “1 as output of the function” bears 
no relation to the unit representing “1 as in- 
put to the function.” Although any func- 
tion specifies a mapping [e.g., a mapping 
from values of x to values of f(x)], learning a 
mapping is not the same thing as learning a 
function. Among other differences, a function
can be universally quantified [e.g., ∀x,
f(x) = x], whereas a finite mapping cannot;
universal quantification permits the function 
to apply to numbers (and even nonnum- 
bers) that lie well outside the “training” set. 
The point is that the connectionist model’s 
failure to represent variables independently 
of their values (and vice versa) relegates it 
to (at best) approximating a subset of the
identity function as a simple, and ultimately
arbitrary, mapping (see Marcus, 1998). People,
by contrast, represent variables independently
of their values (and vice versa) and
so can recognize and exploit the decidedly
nonarbitrary relation between the function’s
inputs and its outputs: To us, but not to
the network, the function is not an arbitrary
mapping at all, but rather a trivial game of
“say what I say.”

Figure 4.1. Diagram of a two-layer
connectionist network for solving the identity
function in which the first three units (those
representing the numbers 1, 2, and 3) have been
trained and the last two (those representing the
numbers 4 and 5) have not. Black lines indicate
already trained connections, whereas grey lines
denote untrained connections. Thicker lines
indicate highly excitatory connections, whereas
thinner lines signify slightly excitatory or slightly
inhibitory connections.

As these examples illustrate, the power of 
human reasoning and learning, most notably 
our capacity for sophisticated relational gen- 
eralizations, is dependent on the capacity 
to represent relational roles (variables) and 
bind them to fillers (values). This is precisely 
the same capacity that permits composition 
of complex symbols from simpler ones. The 


system; hence, any model that succeeds in 
eliminating symbol systems will ipso facto 
have succeeded in eliminating itself from 
contention as a model of the human cog- 
nitive architecture. 


Conjunctive Connectionist 
Representations 


Some modelers, recognizing both the es- 
sential role of relational representations in 
human cognition (e.g., for relational gen- 
eralization) and the value of distributed 
representations, have sought to construct 
symbolic representations in connectionist architectures.
The most common approach is
based on Smolensky’s (1990) tensor products
(e.g., Halford et al., 1998) and its
relatives, such as spatter codes (Kanerva, 
1998), holographic reduced representations 
(HRRs; Plate, 1994), and circular convo- 
lutions (Metcalfe, 1990). We restrict our 
discussion to tensor products because the 
properties of tensors we discuss also apply 
to the other approaches (see Holyoak & 
Hummel, 2000). 

A tensor product is an outer product of 
two or more vectors that are treated as an 
activation vector (i.e., rather than a matrix) 
for the purposes of knowledge representa- 
tion (see Smolensky, 1990). In the case of a 
rank 2 tensor, uv, formed from two vectors,
u and v, the activation of the ijth element
of uv is simply the product of the activations
of the ith and jth elements of u and v,
respectively: $(uv)_{ij} = u_i v_j$. Similarly, the ijkth
value of the rank 3 tensor uvw is the product
$(uvw)_{ijk} = u_i v_j w_k$, and so forth, for any number
of vectors (i.e., for any rank).
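
The element-wise definition can be checked with a short sketch using plain Python lists. The particular vectors below are hypothetical stand-ins for a relation and two objects; only the outer-product construction itself comes from the text.

```python
def outer2(u, v):
    """Rank 2 tensor: element [i][j] is u[i] * v[j]."""
    return [[ui * vj for vj in v] for ui in u]

def outer3(u, v, w):
    """Rank 3 tensor: element [i][j][k] is u[i] * v[j] * w[k]."""
    return [[[ui * vj * wk for wk in w] for vj in v] for ui in u]

# Hypothetical activation vectors.
loves = [1, 0]
john = [1, 1, 0]
mary = [0, 1, 1]

john_loves_mary = outer3(loves, john, mary)
mary_loves_john = outer3(loves, mary, john)

# The same three vectors in a different order yield a different tensor.
print(john_loves_mary == mary_loves_john)
```

Swapping the order of the last two vectors changes which elements are active, which is how argument order can carry binding information in this scheme.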

Tensors and their relatives can be used 
to represent role-filler bindings. For exam- 
ple, if the loves relation is represented by the 
vector u, John by the vector v, and Mary 
by the vector w, then the proposition loves 
(John, Mary) could be represented by the 
tensor uvw; loves (Mary, John) would be represented
by the tensor uwv. This procedure
for representing propositions as tensors, in
which the predicate is represented by one
vector (here, u) and its argument(s) by the
others (here, v and w), is isomorphic with SAA
(Halford et al., 1998): One entity (here, a
vector) represents the relation, other enti- 
ties represent its arguments, and the bind- 
ings of arguments to roles of the relation 
are represented spatially (note the differ- 
ence between uvw and uwv). This version 
of tensor-based coding is SAA-isomorphic; 
the entire relation is represented by a single 
vector or symbol, and arguments are bound 
directly to that symbol. Consequently, it 
provides no basis for differentiating the se- 
mantic features of the various roles of a 
relation. 

Another way to represent relational bind- 
ings using tensors is to represent individual 
relational roles as vectors, role-filler bind- 
ings as tensors, and complete propositions 
as sums of tensors (e.g., Tesar & Smolensky, 
1994). For example, if the vector l represents
the lover role of the loves relation, b the
beloved role, j John and m Mary, then loves
(John, Mary) would be represented by the
sum lj + bm, and loves (Mary, John) would
be the sum lm + bj.

Tensors provide a basis for representing 
the semantic content of relations (in the case 
of tensors that are isomorphic with SAA) 
or relational roles (in the case of tensors 
based on role-filler bindings) and to repre- 
sent role-filler bindings explicitly. Accord- 
ingly, numerous researchers have argued that 
tensor products and their relatives provide 
an appropriate model of human symbolic 
representations. Halford and his colleagues 
also showed that tensor products based on 
SAA representations provide a natural ac- 
count of the capacity limits of human work- 
ing memory and applied these ideas to ac- 
count for numerous phenomena in relational 
reasoning and cognitive development (see 
Halford, Chap. 22). Tensors are thus at least 
a useful approximation of human relational 
representations. 

However, tensor products and their
relatives have two properties that limit 
their adequacy as a general model of hu- 
man relational representations. First, tensors 
necessarily violate role-filler independence 
(Holyoak & Hummel, 2000; Hummel & 
Holyoak, 2003a). This is true both of SAA-isomorphic
tensors (as advocated by Halford
and colleagues) and role-filler binding-based
tensors (as advocated by Smolensky and col- 
leagues). A tensor product is a product of 
two or more vectors, and so the similarity of 
two tensors (e.g., their inner product or the 
cosine of the angle between them) is equal 
to the product of the similarities of the ba- 
sic vectors from which they are constructed. 
For example, in the case of tensors ab and cd 
formed from vectors a, b, c, and d: 


$$ab \cdot cd = (a \cdot c)(b \cdot d), \qquad (4.1)$$


where the “$\cdot$” denotes the inner product, and

$$\cos(ab, cd) = \cos(a, c)\,\cos(b, d), \qquad (4.2)$$


where cos(x, y) is the cosine of the angle 
between x and y. 

In other words, two tensor products are 
similar to one another to the extent that their 
roles and fillers are similar to one another. If 
vectors a and c represent relations (or re- 
lational roles) and b and d represent their 
fillers, then the similarity of the ab binding 
to the cd binding is equal to the similarity of 
roles a and c times the similarity of fillers b 
and d. This fact sounds unremarkable at first 
blush. However, consider the case in which 
a and c are identical (for clarity, let us re- 
place them both with the single vector r), 
but b and d are completely unrelated (i.e., 
they are orthogonal, with an inner product 
of zero). In this case, 


$$rb \cdot rd = (r \cdot r)(b \cdot d) = 0. \qquad (4.3)$$


That is, the similarity of rb to rd is zero even 
though both refer to the same relational role. 
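
Both the general identity and the problematic special case can be verified numerically. The sketch below uses small integer vectors chosen arbitrarily for illustration; the helper names are assumptions of the sketch.

```python
def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def tensor(u, v):
    """Outer product of u and v, flattened into a single activation vector."""
    return [ui * vj for ui in u for vj in v]

# Arbitrary example vectors.
a, b = [1, 2, 0], [3, 1]
c, d = [0, 1, 4], [2, 2]

# The similarity of two bindings equals the product of the similarities of their parts.
print(dot(tensor(a, b), tensor(c, d)), dot(a, c) * dot(b, d))

# Same role r, orthogonal fillers: the two bindings are themselves orthogonal.
r = [1, 2, 0]
filler_b, filler_d = [1, 0], [0, 1]
print(dot(tensor(r, filler_b), tensor(r, filler_d)))
```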

This result is problematic for tensor- 
based representations because a connection- 
ist network (and for that matter, probably a 
person) will generalize learning from rb to 
rd to the extent that the two are similar to 
one another. Equation (4.3) shows that, if b 
and d are orthogonal, then rb and rd will be 
orthogonal even though they both represent 
bindings of different arguments to exactly 
the same relational role (r). As a result, tensor
products cannot support relational generalization.
The same limitation applies to all
multiplicative binding schemes (i.e., representations
in which the vector representing
a binding is a function of the product of
the vectors representing the bound elements),
including HRRs, circular convolutions,
and spatter codes (see Hummel &
Holyoak, 2003a).

A second problem for tensor-based repre- 
sentations concerns the representation of the 
semantics of relational roles. Tensors that are 
SAA-isomorphic (e.g., Halford et al., 1998) 
fail to distinguish the semantics of differ- 
ent roles of the relation precisely because 
they are SAA-isomorphic (see Doumas & 
Hummel, 2004): Rather than using sepa- 
rate vectors to represent a relation’s roles, 
SAA-isomorphic tensors represent the rela- 
tion, as a whole, using a single vector. Role- 
filler binding tensors (e.g., as proposed by 
Smolensky and colleagues) do explicitly rep- 
resent the semantic content of the individual 
roles of a relation. However, these represen- 
tations are limited by the summing opera- 
tion that is used to conjoin the separate role- 
filler bindings into complete propositions. 
The result of the summing operation is a “su- 
perposition catastrophe” (von der Malsburg, 
1981) in which the original role-filler bind- 
ings — and therefore the original roles and 
fillers — are unrecoverable (a sum underde- 
termines its addends). 
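
One way to see how a sum underdetermines its addends is to superimpose more than one proposition. In the sketch below, the choice to superimpose two propositions and the specific localist vectors are assumptions made for illustration: two different pairs of propositions collapse onto the same summed vector, so the original bindings cannot be read back out.

```python
def bind(role, filler):
    """Role-filler binding as a flattened outer product."""
    return [r * f for r in role for f in filler]

def add(*vectors):
    """Element-wise sum of any number of equal-length vectors."""
    return [sum(values) for values in zip(*vectors)]

# Hypothetical role and filler vectors.
lover_role, beloved_role = [1, 0], [0, 1]
john, mary = [1, 0], [0, 1]

def loves(lover, beloved):
    """A proposition as the sum of its two role-filler bindings."""
    return add(bind(lover_role, lover), bind(beloved_role, beloved))

# A single proposition is still distinguishable from its reversal...
print(loves(john, mary) == loves(mary, john))

# ...but two superimposed pairs of propositions become indistinguishable.
mutual_love = add(loves(john, mary), loves(mary, john))
self_love = add(loves(john, john), loves(mary, mary))
print(mutual_love == self_love)
```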

The deleterious effects of this superpo- 
sition can be minimized by using sparse 
representations in a very high-dimensional 
space (Kanerva, 1998; Plate, 1991). This 
approach works because it minimizes the 
representational overlap between separate 
concepts. However, minimizing the repre- 
sentational overlap also minimizes the pos- 
itive effects of distributed representations 
(which stem from the overlap between rep- 
resentations of similar concepts). In the 
limit, sparse coding becomes equivalent to 
localist conjunctive coding with completely 
separate codes for every possible conjunc- 
tion of roles and fillers. In this case, there 
is no interference between separate bind- 
ings, but neither is there overlap between 
related concepts. Conversely, as the overlap 
between related concepts increases, so does 
the ambiguity of sums of separate role bind- 
ings. The ability to keep separate bindings 
separate thus invariably trades off against 


similar vectors. This trade-off is a symp- 
tom of the fact that tensors are trapped on 
the implicit relations continuum (Hummel & 
Biederman, 1992) — the continuum from 
holistic (localist) to feature-based (dis- 
tributed), vector-based representations of 
concepts — characterizing representational 
schemes that fail to code relations indepen- 
dently of their arguments. 


Role-Filler Binding by Vector Addition 


What is needed is a way to both represent 
roles and their fillers in a distributed fash- 
ion (to capture their semantic content) and 
simultaneously bind roles to their fillers in 
a way that does not violate role-filler inde- 
pendence (to achieve meaningfully symbolic 
representation and thus relational general- 
ization). Tensor products are on the right 
track in the sense that they represent rela- 
tions and fillers in a distributed fashion, and 
they can represent role-filler bindings — just 
not in a way that preserves role-filler inde- 
pendence. Accordingly, in the search for a 
distributed code that preserves role-filler in- 
dependence, it is instructive to consider why, 
mathematically, tensors violate it. 

The reason is that a tensor is a product 
of two or more vectors, and so the value of
the ijth element of the tensor is a function of
the ith value of the role vector and the jth
element of the filler vector. That is, a tensor 
is the result of a multiplicative interaction 
between two or more vectors. Statistically, 
when two or more variables do not interact — 
that is, when their effects are independent, as 
in the desired relationship between roles and 
their fillers — their effects are additive (rather 
than multiplicative). Accordingly, the way 
to bind a distributed vector, r, representing 
a relational role to a vector, f, representing 
its filler is not to multiply them but to add 
them (Holyoak & Hummel, 2000; Hummel 
& Holyoak, 1997, 2003a):

rf = r + f, (4.4)

where rf is just an ordinary vector (not
a tensor; see Note 4).
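
The contrast between multiplicative and additive binding can be made concrete with a small sketch. The Python below is an illustration only, not code from any published model: the feature vectors, the unit layout (roles on the first three units, fillers on the last three, following the separate-units idea in Note 4), and the helper names are all assumptions introduced here.

```python
# Illustrative sketch only: vectors, unit layout, and helper names are assumptions.

def tensor_bind(role, filler):
    """Outer product: element [i][j] is role[i] * filler[j]."""
    return [[r_i * f_j for f_j in filler] for r_i in role]

def additive_bind(role, filler):
    """Equation 4.4: rf = r + f, computed element by element."""
    return [r_k + f_k for r_k, f_k in zip(role, filler)]

# Tensor binding: compact 3-unit codes for one role and two fillers.
lover = [1, 0, 1]
susan = [1, 1, 0]
jim = [1, 0, 1]
t_susan = tensor_bind(lover, susan)
t_jim = tensor_bind(lover, jim)

# Count the tensor elements that change when only the filler changes.
changed = sum(a != b
              for row_a, row_b in zip(t_susan, t_jim)
              for a, b in zip(row_a, row_b))

# Additive binding: roles use units 0-2, fillers use units 3-5.
lover_units = lover + [0, 0, 0]
susan_units = [0, 0, 0] + susan
jim_units = [0, 0, 0] + jim
rf_susan = additive_bind(lover_units, susan_units)
rf_jim = additive_bind(lover_units, jim_units)

# The role's part of the pattern is identical under both bindings.
role_part_unchanged = rf_susan[:3] == rf_jim[:3] == lover
```

In the tensor, swapping the filler alters elements in every row where the role is active, so role and filler are entangled. In the sum, the role units carry the same pattern whichever filler is bound.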

Binding by vector addition is commonly implemented in the neural network
modeling community as synchrony of neural 
firing (for reviews, see Hummel & Holyoak, 
1997, 2003a), although it can also be real- 
ized in other ways (e.g., as systematic asynchrony
of firing; Love, 1999). The basic idea
is that vectors representing relational roles 
fire in synchrony with vectors represent- 
ing their fillers and out of synchrony with 
other role-filler bindings. That is, at each in- 
stant in time, a vector representing a role is 
“added to” (fires with) the vector represent- 
ing its filler. 

Binding by synchrony of firing is much 
reviled in some segments of the connec- 
tionist modeling community. For example, 
Edelman and Intrator (2003) dismissed it 
as an “engineering convenience.” Similarly, 
O'Reilly et al. (2003) dismissed it on the 
grounds that (1) it is necessarily transient 
[i.e., it is not suitable as a basis for storing
bindings in long-term memory (LTM)],
(2) it is capacity limited (i.e., it is only 
possible to have a finite number of bound 
groups simultaneously active and mutually 
out of synchrony; Hummel & Biederman, 
1992; Hummel & Holyoak, 2003a; Hum- 
mel & Stankiewicz, 1996), and (3) bind- 
ings represented by synchrony of firing must 
ultimately make contact with stored con- 
junctive codes in LTM. These limitations do 
indeed apply to binding by synchrony of 
firing; (1) and (2) are also precisely the lim- 
itations of human working memory (WM) 
(see Cowan, 2000). Limitation (3) is meant 
to imply that synchrony is redundant: If 
you already have to represent bindings con- 
junctively in order to store them in LTM, 
then why bother to use synchrony? The an- 
swer is that synchrony, but not conjunctive 
coding, makes it possible to represent roles 
independently of their fillers and thus al- 
lows symbolic representations and relational 
generalization. 

Despite the objections of Edelman and 
Intrator (2003), O’Reilly et al. (2003), and 
others, there is substantial evidence for bind- 
ing by synchrony in the primate visual cortex 
(see Singer, 2000, for a review) and frontal 
cortex (e.g., Desmedt & Tomberg, 1994;


and the brain may be happy to exploit “en- 
gineering conveniences.” This would be un- 
surprising given the computational benefits 
endowed by dynamic binding (namely, re- 
lational generalization based on distributed 
representations), the ease with which syn- 
chrony can be established in neural systems, 
and the ease with which it can be exploited 
(it is well known that spikes arriving in close 
temporal proximity have superadditive ef- 
fects on the postsynaptic neuron relative to 
spikes arriving at very different times). The 
mapping between the limitations of human 
WM and the limitations of synchrony cited 
by O'Reilly et al. (2003) also constitutes in- 
direct support for the synchrony hypothe- 
sis, as do the successes of models based on 
synchrony (for reviews, see Hummel, 2000; 
Hummel & Holyoak, 2003b; Shastri, 2003). 

However, synchrony of firing cannot be 
the whole story. At a minimum, conjunc- 
tive coding is necessary for storing bindings 
in LTM and forming localist tokens of roles, 
objects, role-filler bindings, and complete 
propositions (Hummel & Holyoak, 1997, 
2003a). It seems likely, therefore, that an account
of the human cognitive architecture
that includes both “mundane” acts (such as 
shape perception, which actually turns out 
to be relational; Hummel, 2000) and sym- 
bolic cognition (such as planning, reason- 
ing, and problem solving) must incorporate 
both dynamic binding (for independent rep- 
resentation of roles bound to fillers in WM) 
and conjunctive coding (for LTM storage 
and token formation) and specify how they 
are related. 

The remainder of this chapter reviews 
one example of this approach to knowl- 
edge representation — “LISAese,” the rep- 
resentational format used by Hummel and 
Holyoak’s (1992, 1997, 2003a) LISA (Learning
and Inference with Schemas and Analogies)
model of analogical inference and
schema induction — with an emphasis on 
how LISAese permits symbolic representa- 
tions to be composed from distributed (i.e., 
semantically rich) representations of roles 
and fillers and how the resulting representa- 
tions are uniquely suited to simulate aspects 

Figure 4.2. Representation of propositions in LISAese. Objects and relational roles are represented
both as patterns of activation distributed over units representing semantic features (semantic units;
small circles) and as localist units representing tokens of objects (large circles) and relational roles
(triangles). Roles are bound to fillers by localist subproposition (SP) units (rectangles), and role-filler
bindings are bound into complete propositions by localist proposition (P) units (ovals).
(a) Representation of loves (Susan, Jim). (b) Representation of knows [Jim, loves (Susan, Jim)]. When
one P takes another as an argument, the lower (argument) P serves in the place of an object unit
under the appropriate SP of the higher-level P unit [in this case, binding loves (Susan, Jim) to the SP
representing what is known].


of human perception and cognition (also see 
Holyoak, Chap. 6). 

LISAese is based on a hierarchy of dis- 
tributed and localist codes that collectively 
represent the semantic features of objects 
and relational roles and their arrangement 
into complete propositions (Figure 4.2). At 
the bottom of the hierarchy, semantic units 
(small circles in Figure 4.2) represent ob- 
jects and relational roles in a distributed 
fashion. For example, Jim might be represented
by features such as human and male
(along with units representing his person- 


ality traits, etc.), and Susan might be rep- 
resented as human and female (along with 
units for her unique attributes). Similarly, 
the lover and beloved roles of the loves re- 
lation would be represented by semantic 
units capturing their semantic content. At 
the next level of the hierarchy, object and 
predicate units (large circles and triangles in 
Figure 4.2) represent objects and relational 
roles in a localist fashion and share bidi- 
rectional excitatory connections with the 
corresponding semantic units. Subproposi- 
tion units (SPs; rectangles in Figure 4.2) 
represent bindings of relational roles to their
arguments [which can be objects, as in
Figure 4.2(a), or complete propositions, as in 
Figure 4.2(b)]. At the top of the hierarchy, 
separate role-filler bindings (i.e., SPs) are
bound into a localist representation of the 
proposition as a whole via excitatory connec- 
tions to a single proposition (P) unit (ovals 
in Figure 4.2). Representing propositions in 
this type of hierarchy reflects our assump- 
tion that every level of the hierarchy must be 
represented explicitly as an entity in its own 
right (see Hummel & Holyoak, 2003). The 
resulting representational system is com- 
monly referred to as a role-filler binding sys- 
tem (see Halford et al., 1998). Both rela- 
tional roles and their fillers are represented 
explicitly, and relations are represented as 
linked sets of role-filler bindings. Impor- 
tantly, in role-filler binding systems, rela- 
tional roles, their semantics, and their bind- 
ings to their fillers are all made explicit in 
the relational representations themselves. As 
a result, role-filler binding representations 
are not subject to the problems inherent 
in SAA representations discussed previously 
wherein relational roles are left implicit in 
the larger relational structures. 
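
The hierarchy just described can be sketched as a small data structure. The Python below is a hedged illustration, not LISA's implementation: the class names, the role names (knower, known), and the placeholder semantic features for roles are assumptions introduced here. Only the four levels (semantic units, object and predicate units, SPs, and P units) and the example propositions come from the text.

```python
from dataclasses import dataclass
from typing import Union

# Illustrative sketch only: class names, role names, and role features are assumptions.

@dataclass(frozen=True)
class Token:
    """Localist object or predicate (role) unit linked to semantic units."""
    name: str
    semantics: frozenset

@dataclass(frozen=True)
class SP:
    """Subproposition unit: binds one relational role to one filler."""
    role: Token
    filler: Union[Token, "P"]

@dataclass(frozen=True)
class P:
    """Proposition unit: links the role-filler bindings of one proposition."""
    sps: tuple

susan = Token("Susan", frozenset({"human", "female"}))
jim = Token("Jim", frozenset({"human", "male"}))
lover = Token("lover", frozenset({"lover-features"}))
beloved = Token("beloved", frozenset({"beloved-features"}))
knower = Token("knower", frozenset({"knower-features"}))
known = Token("known", frozenset({"known-features"}))

# loves (Susan, Jim), and knows [Jim, loves (Susan, Jim)] with a P as filler.
loves = P((SP(lover, susan), SP(beloved, jim)))
knows = P((SP(knower, jim), SP(known, loves)))

def bindings(p):
    """List (role, filler) names; a nested proposition is expanded recursively."""
    return [
        (sp.role.name,
         bindings(sp.filler) if isinstance(sp.filler, P) else sp.filler.name)
        for sp in p.sps
    ]
```

Calling bindings(knows) walks the hierarchy from the P unit down through its SPs, which makes explicit that every role and every binding is an entity in its own right.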

A complete analog (i.e., story, situation, 
or event) in LISAese is represented by the 
collection of P, SP, predicate, object, and se- 
mantic units that code its propositional con- 
tent. Within an analog, a given object, re- 
lational role, or proposition is represented 
by a single localist unit regardless of how 
many times it is mentioned in the analog 
[e.g., Susan is represented by the same unit 
in both loves (Susan, Jim) and loves (Charles, 
Susan)], but a given element is represented 
by separate localist units in separate analogs. 
The localist units thus represent tokens of in- 
dividual objects, relations, or propositions in 
particular situations (i.e., analogs). A given 
object or relational role will tend to be 
connected to many of the same semantic 
units in all the analogs in which it is men- 
tioned, but there may be small differences 
in the semantic representation, depending 
on context (e.g., Susan might be connected 
to semantics describing her profession in 
an analog that refers to her work and to
features specifying her height in an analog
about her playing basketball; see Hummel
& Holyoak, 2003a). Thus, whereas the localist
units represent tokens, the semantic units
represent types. 

The hierarchy of units depicted in Fig- 
ure 4.2 represents propositions both in 
LISA’s LTM and, when the units become ac- 
tive, in its WM. In this representation, the 
binding of roles to fillers is captured by the 
localist (and conjunctive) SP units. When 
a proposition becomes active, its role-filler 
bindings are also represented dynamically 
by synchrony of firing. When a P unit be- 
comes active, it excites the SPs to which it 
is connected. Separate SPs inhibit one an- 
other, causing them to fire out of synchrony 
with one another. When an SP fires, it acti- 
vates the predicate and object units beneath 
it, and they activate the semantic units be- 
neath themselves. On the semantic units, the 
result is a collection of mutually desynchro- 
nized patterns of activation, one for each role 
binding. For example, the proposition loves 
(Susan, Jim) would be represented by two 
such patterns, one binding the semantic fea- 
tures of Susan to the features of lover, and the 
other binding Jim to beloved. The proposi- 
tion loves (Jim, Susan) would be represented 
by the very same semantic units (as well as 
the same object and predicate units); only 
the synchrony relations would be reversed. 
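
The claim that loves (Susan, Jim) and loves (Jim, Susan) use the same units and differ only in synchrony can be illustrated with a toy sketch. It assumes that time is divided into discrete slices and that one role binding fires per slice. The feature labels for the roles are hypothetical placeholders, and nothing here models the inhibitory dynamics that produce desynchronization in LISA.

```python
# Illustrative sketch only: role feature labels and discrete time slices are assumptions.
SEMANTICS = {
    "Susan": {"human", "female"},
    "Jim": {"human", "male"},
    "lover": {"role:lover"},
    "beloved": {"role:beloved"},
}

def firing_pattern(bindings):
    """One time slice per role binding: role and filler features fire together."""
    return [frozenset(SEMANTICS[role] | SEMANTICS[filler])
            for role, filler in bindings]

def active_units(pattern):
    """All semantic units that fire at some point, ignoring timing."""
    return frozenset().union(*pattern)

loves_susan_jim = firing_pattern([("lover", "Susan"), ("beloved", "Jim")])
loves_jim_susan = firing_pattern([("lover", "Jim"), ("beloved", "Susan")])

# The same units take part in both propositions ...
same_units = active_units(loves_susan_jim) == active_units(loves_jim_susan)
# ... but the groups of units that fire together differ.
same_synchrony = set(loves_susan_jim) == set(loves_jim_susan)
```

Here same_units is true and same_synchrony is false: the two propositions are distinguished only by which features fire together.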

The resulting representations explicitly 
bind semantically rich representations of 
relational roles to representations of their 
fillers (at the level of semantic features, pred- 
icate and object units, and SPs) and represent 
complete relations as conjunctions of role- 
filler bindings (at the level of P units). As 
a result, they do not fall prey to the short- 
comings of traditional connectionist repre- 
sentations (which cannot dynamically bind 
roles to their fillers), those of SAA (which 
can represent neither relational roles nor 
their semantic content explicitly), or those 
of tensors. 

Hummel, Holyoak, and their colleagues 
have shown that LISAese knowledge rep- 
resentations, along with the operations that 
act on them, account for a very large 
number of phenomena in human relational 
reasoning, including memory retrieval, analogy making
(Hummel & Holyoak, 1997), analogical in- 
ference, and schema induction (Hummel & 
Holyoak, 2003a). They provide a natural account
of the limitations of human WM, ontogenetic
and phylogenetic differences between
individuals and species (Hummel &
Holyoak, 1997), the relation between ef- 
fortless (“reflexive”; Shastri & Ajjanagadde, 
1993) and more effortful (“reflective”) forms 
of reasoning (Hummel & Choplin, 2000), 
and the effects of frontotemporal degener- 
ation (Morrison et al., 2004; Waltz et al., 
1999) and natural aging (Viskontas et al., 
in press) on reasoning and memory. They 
also provide a basis for understanding the 
perceptual-cognitive interface (Green & 
Hummel, 2004) and how specialized cog- 
nitive “modules” (e.g., for reasoning about 
spatial arrays of objects) can work with 
the broader cognitive architecture in the 
service of specific reasoning tasks (e.g., 
transitive inference; Holyoak & Hummel, 
2000) (see Hummel & Holyoak, 2003b, for
a review). 


Summary 


An explanation of human mental represen- 
tations — and the human cognitive architec- 
ture more broadly — must account both for 
our ability to represent the semantic content 
of relational roles and their fillers and for our 
ability to bind roles to their fillers dynam- 
ically without altering the representation 
of either. 

Traditional symbolic approaches to cog- 
nition capture the symbolic nature of hu- 
man relational representations, but they fail 
to specify the semantic content of roles and 
their fillers — a failing that, as noted by the 
connectionists in the 1980s, renders them 
too inflexible to serve as an adequate ac- 
count of human mental representations, and, 
as shown by Doumas and Hummel (2004), 
appears inescapable. 

Traditional distributed connectionist ap- 
proaches have the opposite strengths and 
weaknesses: They succeed in capturing the
semantic content of the entities they represent
but fail to provide any basis for binding
those entities together into symbolic (i.e., 
relational) structures. This failure renders 
them incapable of relational generalization. 

Connectionist models that attempt to 
achieve symbolic competence by using ten- 
sor products and other forms of conjunc- 
tive coding as the sole basis for role-filler 
binding find themselves in a strange world 
in between the symbolic and connection- 
ist approaches (i.e., on the implicit relations 
continuum) neither fully able to exploit the 
strengths of the connectionist approach nor 
fully able to exploit the strengths of the sym- 
bolic approach. 

Knowledge representations based on dy- 
namic binding of distributed representations 
of relational roles and their fillers (of which 
LISAese is an example) — in combination 
with localist representations of roles, fillers,
role-filler bindings, and their composition 
into complete propositions — can simulta- 
neously capture both the symbolic nature 
and semantic richness of human mental rep- 
resentations. The resulting representations 
are neurally plausible, semantically rich, 
flexible, and meaningfully symbolic. They 
provide the basis for a unified account of human
memory storage and retrieval, analogical
reasoning, and schema induction, including
a natural account of the strengths,
limitations, and frailties of human relational
reasoning.
Notes


1. Arguments (or roles) may suggest different 
shades of meaning as a function of the roles 
(or fillers) to which they are bound. For exam- 
ple, “loves” suggests a different interpretation 
in loves (John, Mary) than it does in loves (John,
chocolate). However, such contextual variation
does not imply in any general sense that
the filler (or role) itself necessarily changes its 
identity as a function of the binding. For exam- 
ple, our ability to appreciate that the “John” 
in loves (John, Mary) is the same person as 
the “John” in bites (Rover, John) demands ex- 
planation in terms of John’s invariance across 
the different bindings. If we assume invari- 
ance of identity with binding as the general 
case, then it is possible to explain contextual 
shadings in meaning when they occur (Hum- 
mel & Holyoak, 1997). However, if we assume 
lack of invariance of identity as the general 
case, then it becomes impossible to explain 
how knowledge acquired about an individual 
or role in one context can be connected to 
knowledge about the same individual or role in 
other contexts. 


2. In the most extreme version of this account, 
the individual processing elements are not as- 
sumed to “mean” anything at all in isolation; 
rather they take their meaning only as part of a 
whole distributed pattern. Some limitations of 
this extreme account are discussed by Bowers
(2002) and Page (2000).

3. For example, Falkenhainer, Forbus, and Gentner’s
(1989) structure matching engine
(SME), which uses SAA-based representations 
to perform graph matching, cannot map loves 
(Abe, Betty) onto likes (Peter, Bertha) because 
loves and likes are nonidentical predicates. To 
perform this mapping, SME must recast the 
predicates into a common form, such as has- 
affection-for (Abe, Betty) and has-affection-for 
(Peter, Bertha) and then map these identical
predicates. 


4. At first blush, it might appear that adding two 
vectors where one represents a relational role 
and the other its filler should be susceptible 
to the very same problem that we faced when 
adding two tensors where each represented a 
role-filler binding, namely the superposition 
catastrophe. It is easy to overcome this prob- 
lem in the former case, however, by simply 
using different sets of units to represent roles 
and fillers so the network can distinguish them 
when added (see Hummel & Holyoak, 2003a).
This solution might also be applied to role- 
filler binding with tensors, although doing so 
would require using different sets of units to 
code different role-filler bindings. This solu- 
tion would require allocating separate tensors 
to separate role-filler bindings, thus adding a
further layer of conjunctive coding and further
violating role-filler independence.
CHAPTER 5
The Problem of Induction 


Steven A. Sloman 
David A. Lagnado 


In its classic formulation, due to Hume 
(1739, 1748), inductive reasoning is an ac- 
tivity of the mind that takes us from the 
observed to the unobserved. From the fact 
that the sun has risen every day thus far, we 
conclude that it will rise again tomorrow; 
from the fact that bread has nourished us 
in the past, we conclude that it will nourish
us in the future. The essence of inductive
reasoning lies in its ability to take us beyond
the confines of our current evidence or
knowledge to novel conclusions about the 
unknown. These conclusions may be partic- 
ular, as when we infer that the next swan 
we see will be white, or general, as when we 
infer that all swans are white. They may con- 
cern the future, as in the prediction of rain 
from a dark cloud, or concern something in 
the past, as in the diagnosis of an infection 
from current symptoms. 

Hume argued that all such reasoning is 
founded on the relation of cause and effect. 
It is this relation that takes us beyond our 
current evidence, whether it is an inference 
from cause to effect, or effect to cause, or 
from one collateral effect to another. Having 
identified the causal basis of our inductive 


reasoning, Hume proceeded to raise a funda- 
mental question now known as “the problem 
of induction” — what are the grounds for such 
inductive or causal inferences? In attempting 
to answer this question, Hume presents both 
a negative and a positive argument. 

In his negative thesis, Hume argued that 
our knowledge of causal relations is not 
attainable through demonstrative reasoning, 
but is acquired through past experience. To 
illustrate, our belief that fire causes heat, and 
the expectation that it will do so in the fu- 
ture, is based on previous cases in which one 
has followed the other, and not on any a pri- 
ori reasoning. However, once Hume iden- 
tified experience as the basis for inductive 
inference, he proceeded to demonstrate its 
inadequacy as a justification for these infer- 
ences. Put simply, any such argument re- 
quires the presupposition that past experi- 
ence will be a good guide to the future, and 
this is the very claim we seek to justify. 

For Hume, what is critical about our 
experience is the perceived similarity be- 
tween particular causes and their effects: 
“From causes, which appear similar, we ex- 
pect similar effects. This is the sum of all our 
experimental conclusions” (see Goldstone & Son, Chap. 2). However, this expectation
cannot be grounded in reason alone because 
similar causes could conceivably be followed 
by dissimilar effects. Moreover, if one intro- 
duces hidden powers or mechanisms to ex- 
plain our observations at a deeper level, the 
problem just gets shifted down. What guar- 
antees that the powers or mechanisms that 
underlie our current experiences will do so 
in the future? 

In short, Hume’s negative argument un- 
dermines the assumption that the future will 
resemble the past. This assumption cannot 
be demonstrated a priori because it is not 
contradictory to imagine that the course of 
nature may change. However, neither can it 
be supported by an appeal to past experience 
because this would be to argue in a circle. 

Hume’s argument operates at two levels, 
both descriptive and justificatory. At the de- 
scriptive level, it suggests that there is no ac- 
tual process of reflective thought that takes 
us from the observed to the unobserved. 
After all, as Hume points out, even young 
infants and animals make such inductions, 
although they clearly do not use reflective 
reasoning. At the justificatory level, it sug- 
gests that there is no possible line of rea- 
soning that could do so. Thus, Hume argues 
both that reflective reasoning does not and 
could not determine our inductive inferences. 

Hume’s positive argument provides an 
answer to the descriptive question of how
we actually pass from the observed to the
unobserved but not to the justificatory one.
He argues that it is custom or habit that 
leads us to make inferences in accordance 
with past regularities. Thus, after observing 
many cases of a flame being accompanied 
by heat, a novel instance of a flame creates 
the idea, and hence an expectation, of heat. 
In this way, a correspondence is set up be- 
tween the regularities in the world and the 
expectations of the mind. Moreover, Hume 
maintains that this tendency is “implanted 
in us as an instinct” because nature would 
not entrust it to the vagaries of reason. In 
modern terms, then, we are prewired to ex- 
pect past associations to hold in the future, 
although what is associated with what will 


This idea of a general-purpose associative 
learning system has inspired many contem- 
porary accounts of inductive learning (see 
Buehner & Cheng, Chap. 7). 

Hume’s descriptive account suffers from 
several shortcomings. For one, it seems to as- 
sume there is an objective sense of similarity 
or resemblance that allows us to pass from 
like causes to like effects, and vice versa. In 
fact, a selection from among many dimen- 
sions of similarity might be necessary for a 
particular case. For example, to what degree 
and in what respects does a newly encoun- 
tered object (e.g., a new type of candy bar) 
need to be similar to previously encountered 
objects for someone to expect a similar prop- 
erty (like a similar taste)? If we are to acquire 
any predictive habits, we must be able to 
generalize to some extent from one object 
to another, or to the same object at different
times and contexts. How this is carried
out is as much in need of a descriptive ac- 
count as the problem of induction itself. Sec- 
ond, we might accept that no reflective rea- 
soning can justify our inductive inferences, 
but this does not entail that reflective rea- 
soning cannot be the actual cause of some 
of our inferences. Nevertheless, Hume pre- 
sciently identified the critical role of both 
similarity and causality in inductive reason- 
ing, the variables that, as we will see, are 
at the heart of work on the psychology of 
induction. 

Hume was concerned with questions of 
both description and justification. In con- 
trast, the logical empiricists (e.g., Carnap, 
1950, 1966; Hempel, 1965; Reichenbach, 
1938) focused only on justification. Having 
successfully provided a formal account of de- 
ductive logic (Frege, 1880; Russell & White- 
head, 1925) in which questions of deductive 
validity were separated from how people ac- 
tually make deductive inferences (see Evans, 
Chap. 8), philosophers attempted to do the 
same for inductive inference by formulating 
rules for an inductive logic. 

Central to this approach is the belief 
that inductive logic, like deductive logic, 
concerns the logical relations that hold be- 
tween statements irrespective of their truth 
or falsity. In inductive logic, however, these relations admit of varying
strengths, a conditional probability measure 
reflecting the rational degree of belief that 
someone should have in a hypothesis given 
the available evidence. For example, the hy- 
pothesis that “all swans are white” is made 
probable (to degree p) by the evidence state- 
ment that “all swans in Central Park are 
white.” On this basis, the logical empiricists 
hoped to codify and ultimately justify the 
principles of sound inductive reasoning. 
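
The idea of a conditional probability measuring the support that evidence lends a hypothesis can be made concrete with a toy calculation. The sketch below assumes a world containing only three swans, two of them in Central Park, and a uniform measure over all possible colorings. Both assumptions are introduced purely for illustration and are not attributed to any of the authors cited.

```python
from fractions import Fraction
from itertools import product

# Illustrative sketch only: three swans and a uniform measure are assumptions.
# Each world assigns white (True) or non-white (False) to three swans.
# Swans 0 and 1 live in Central Park; swan 2 lives elsewhere.
worlds = list(product([True, False], repeat=3))

def probability(event, given=lambda world: True):
    """Conditional probability of event under a uniform measure over worlds."""
    relevant = [w for w in worlds if given(w)]
    favorable = sum(1 for w in relevant if event(w))
    return Fraction(favorable, len(relevant))

def all_swans_white(world):
    return all(world)

def park_swans_white(world):
    return world[0] and world[1]

prior = probability(all_swans_white)
posterior = probability(all_swans_white, given=park_swans_white)
```

Under these assumptions the hypothesis has probability 1/8 before the evidence and 1/2 after it, so the evidence statement makes the hypothesis probable to degree p = 1/2.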

This project proved to be fraught with dif- 
ficulties, even for the most basic inductive 
rules. Thus, consider the rule of induction by 
enumeration, which states that a universal
hypothesis H1 is confirmed or made probable
by its positive instances E. The problem
is that these very same instances will also
confirm a different universal hypothesis H2
(indeed, an infinity of them), which makes 
an entirely opposite prediction about sub- 
sequent cases. The most notorious illustra- 
tion of this point was provided by Goodman 
(1955) and termed “the new riddle of in- 
duction.” Imagine that you have examined 
numerous emeralds and found them all to 
be colored green. You take this body of evi- 
dence E to confirm (to some degree) the hy- 
pothesis that “All emeralds are green.” How- 
ever, suppose we introduce the predicate 
“grue,” which applies to all objects examined
so far (before time t) and found to be green 
and to all objects not examined and blue. 
Given this definition and the rule that a uni- 
versal hypothesis is confirmed by its positive 
instances, our evidence set E also confirms 
the gruesome hypothesis “All emeralds are 
grue.” However, this is highly undesirable 
because each hypothesis makes an entirely 
different prediction as to what will happen 
in the future (after time t), when we examine
a new emerald. Goodman stated this
problem as one of projectibility: How can we 
justify or explain our preference to project 
predicates such as “green” from past to fu- 
ture instances, rather than predicates such 
as “grue”? 
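
Goodman's riddle can be restated as a short sketch. The code below treats "not examined" as "first examined at or after time t" and picks an arbitrary value for t. Both choices, along with the function names, are assumptions made for illustration.

```python
# Illustrative sketch only: the cutoff value and function names are assumptions.
T = 100  # hypothetical cutoff time t

def is_green(color, examined_at):
    """Green: the color is green, whenever the object is examined."""
    return color == "green"

def is_grue(color, examined_at):
    """Grue: examined before t and green, or not examined before t and blue."""
    if examined_at < T:
        return color == "green"
    return color == "blue"

# Evidence E: emeralds examined before t, all found to be green.
evidence = [("green", time) for time in (3, 17, 42, 99)]

confirms_green = all(is_green(color, when) for color, when in evidence)
confirms_grue = all(is_grue(color, when) for color, when in evidence)

def predicted_colors(predicate, examined_at):
    """Colors a new emerald may have if the universal hypothesis holds."""
    return [color for color in ("green", "blue")
            if predicate(color, examined_at)]

green_prediction = predicted_colors(is_green, T + 1)
grue_prediction = predicted_colors(is_grue, T + 1)
```

Every item of evidence is a positive instance of both hypotheses, yet for an emerald first examined after t the two hypotheses predict different colors.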

Many commentators object that the 
problem hinges on the introduction of a 
bizarre predicate, but the same point can be 


or simply in terms of functions (see Hempel, 
1965). Indeed, the problem of drawing a line
or curve through a finite set of data points 
illustrates the same difficulty. Two curves C1
and C2 may fit the given data points equally
well but diverge otherwise. According to the 
simple inductive rule, both are equally con- 
firmed and yet we often prefer one curve 
over the other. Unfortunately, an inductive 
logic of the kind proposed by Carnap (1950) 
gives us no grounds to decide which predi- 
cate (or curve) to project. 
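
The curve-fitting version of the problem is easy to exhibit numerically. In the sketch below the data points and both functions are invented for illustration: the second curve adds a term that vanishes at every observed point.

```python
# Illustrative sketch only: the data points and both curves are invented examples.
data = [(0, 0), (1, 1), (2, 2)]

def c1(x):
    """A straight line through the data."""
    return x

def c2(x):
    """A cubic whose extra term is zero at x = 0, 1, and 2."""
    return x + x * (x - 1) * (x - 2)

# Both curves pass through every observed point ...
both_fit = all(c1(x) == y and c2(x) == y for x, y in data)
# ... but they disagree about the next, unobserved point.
next_x = 3
predictions = (c1(next_x), c2(next_x))
```

Both curves fit the three observations exactly, but at x = 3 the line predicts 3 and the cubic predicts 9. The data alone give no reason to prefer either.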

In general, then, Goodman’s (1955) prob- 
lem of projectibility concerns how we distin- 
guish projectible predicates such as “green” 
from nonprojectible ones such as “grue.” Al- 
though he concurred with Hume’s claim 
that induction consists of a mental habit 
formed by past regularities, he argued that 
Hume overlooked the further problem (the 
new riddle) of which past regularities are se- 
lected by this mental habit and thus pro- 
jected in the future. After all, it would 
appear that we experience a vast range of 
regularities and yet are prepared to project 
only a small subset. Goodman himself of- 
fered a solution in terms of entrenchment. In 
short, a predicate is entrenched if it has a past 
history of use, where both the term itself 
and the extension of the term figure in this
usage. Thus, “green” is entrenched, whereas
“grue” is not because our previous history of
projections involves numerous cases of the 
former, but none of the latter. In common 
with Hume, then, Goodman gave a descrip- 
tive account of inductive inference, but one 
grounded in the historic practices of people, 
and in particular their language use, rather 
than simply the psychology of an individual. 

One shortcoming of Goodman’s proposal 
is that it hinges on language use. Ultimately, 
he attempted to explain our inductive prac- 
tices in terms of our linguistic practices: “the 
roots of inductive validity are to be found 
in our use of language.” However, surely in- 
ductive questions, such as the problem of 
projectibility, arise and are solved by infants 
and animals without language (see Suppes, 
1994). Indeed, our inductive practices may 
drive our linguistic practices, rather than 
ruled out, or at least overlooked, the possibility
that the notions of similarity and
causality are integral to the process of in- 
ductive reasoning. However, as we will see, 
more recent analyses suggest that these are 
the concepts that will give us the most lever- 
age on the problem of induction. 

In his essay, “Natural Kinds” (1970),
Quine defended a simple and intuitive an- 
swer to Goodman’s problem: Projectible 
predicates apply to members of a kind, a 
grouping formed on the basis of similarity. 
Thus, “green” is projectible, whereas “grue” 
is not because green things are more simi- 
lar than grue things; that is, green emeralds 
form a kind, whereas grue emeralds do not. 
This shifts the explanatory load onto the 
twin notions of similarity and kind, which 
Quine held to be fundamental to inductive 
inference: “every reasonable expectation de- 
pends on similarity.” For Quine, both hu- 
mans and animals possess an innate stan- 
dard of similarity useful for making appro- 
priate inductions. Without this prior notion, 
no learning or generalization can take place. 

Despite the subjectivity of this primitive 
similarity standard, Quine believed that its 
uniformity across humans makes the induc- 
tive learning of verbal behavior relatively 
straightforward. What guarantees, however, 
that our “innate subjective spacing of quali- 
ties” matches up with appropriate groupings 
in nature? Here, Quine appealed to an evolu- 
tionary explanation: Without such a match, 
and thus the ability to make appropriate in- 
ductions, survival is unlikely. 

Like Hume, then, Quine proposed a nat- 
uralistic account of inductive inference, but 
in addition to the instinctive habit of associa- 
tion, he proposed an innate similarity space. 
Furthermore, Quine argued that this primi- 
tive notion of similarity is supplemented, as 
we advance from infant to adult and from 
savage to scientist, by ever more developed 
senses of “theoretical” similarity. The devel- 
opment of such theoretical kinds by the re- 
grouping of things, or the introduction of 
entirely new groupings, arises through “trial- 
and-error theorizing.” In Goodman’s terms, 
novel projections on the basis of second- 


cessful. Although this progress from primi- 
tive to theoretical similarity may actually en- 
gender a qualitative change in our reasoning 
processes, the same inductive tendencies ap- 
ply throughout. Thus, whether we infer heat 
from a flame, or a neutrino from its path in a 
bubble chamber, or even the downfall of an 
empire from the dissatisfaction of its work- 
ers, all such inferences rest on our propensity 
to group kindred entities and project them 
into the future on this basis. 

For Quine, our notions of similarity and 
the way in which we group things become 
increasingly sophisticated and abstract, cul- 
minating, he believed, in their eventual re- 
moval from mature science altogether. This 
conclusion seems to sit uneasily with his 
claims about theoretical similarity. Never- 
theless, as mere humans, we will always be 
left with a spectrum of similarity notions 
and systems of kinds applicable as the con- 
text demands, which accounts for the coex- 
istence of a variety of procedures for carrying 
out inductive inference, a plurality that ap- 
pears to be echoed in more recent cognitive 
psychology (e.g., Cheng & Holyoak, 1985). 

Both Goodman and Quine said little 
about the notion of causality. This is prob- 
ably a hangover from the logical empiricist 
view of science that sought to avoid all ref- 
erence to causal relations in favor of logical 
ones. Contemporary philosophical accounts 
have striven to reinstate the notion of causal- 
ity into induction (Glymour, 2001; Lipton, 
1991; Miller, 1987). 

Miller (1987) and Lipton (1991) provided 
numerous examples of inductive inferences 
that depend on the supposition of, or ap- 
peal to, causal relations. Indeed, Miller pro- 
posed a definition of inductive confirmation 
as causal comparison: Hypotheses are con- 
firmed by appropriate causal accounts of the 
data-gathering process. Armed with this no- 
tion, he claimed that Goodman’s new rid- 
dle of induction is soluble. It is legitimate to 
project “green” but not “grue” because only 
“green” is consistent with our causal knowl- 
edge about color constancy and the belief 
that no plausible causal mechanism sup- 
ports spontaneous color change. He argued 
reasoning must allow for the influence of 
causal beliefs. Further development of such 
an account, however, awaits a satisfactory 
theory of causality (for recent advances, see 
Pearl, 2000). 

In summary, tracing the progress of philo- 
sophical analyses suggests a blueprint for a 
descriptive account of inductive reasoning — 
a mind that can extract relations of similarity 
and causality and apply them to new cate- 
gories in relevant ways. In subsequent sec- 
tions, we argue that this is the same pic- 
ture that is emerging from empirical work in 
psychology. 


Empirical Background 


Experimental work in psychology on how 
people determine the projectibility of a 
predicate has its roots in the study of general- 
ization in learning. Theories of learning were 
frequently attempts to describe the shape of 
a generalization gradient for a simple predi- 
cate applied to an even simpler class often 
defined by a single dimension. For exam- 
ple, if an organism learned that a tone pre- 
dicts food, one might ask how the organism 
would respond to other tones. The function 
describing how a response (such as saliva- 
tion) varies with the similarity of the stimu- 
lus to the originally trained stimulus is called 
a generalization gradient. Shepard (1987) ar- 
gued that such functions are invariably neg- 
atively exponential in shape. 
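
As a rough illustration of what a negatively exponential gradient looks like, the sketch below assumes that response strength falls off as exp(-k * d) with psychological distance d from the trained stimulus. The decay rate k, the 1000 Hz training tone, and the use of octaves as the distance measure are hypothetical choices made here for illustration; they are not values taken from Shepard (1987).

```python
import math


def generalization(distance: float, k: float = 1.0) -> float:
    """Response strength relative to the trained stimulus: exp(-k * distance)."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-k * distance)


def tone_distance(freq_hz: float, trained_hz: float = 1000.0) -> float:
    """Hypothetical psychological distance: separation from the trained tone in octaves."""
    return abs(math.log2(freq_hz / trained_hz))


if __name__ == "__main__":
    # Response is maximal at the trained tone and decays as test tones move away.
    for freq in (1000, 1200, 1500, 2000, 4000):
        d = tone_distance(freq)
        print(f"{freq:>5} Hz  distance={d:.2f}  response={generalization(d):.3f}")
```

Under these assumptions the gradient is steepest near the trained tone and flattens with distance, which is the qualitative shape described above.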

If understood as general theories of induc- 
tion, such theories are necessarily reduction- 
ist in orientation. Because they only consider 
the case of generalization along specific di- 
mensions that are closely tied to the senses 
(often spectral properties of sound or light), 
the assumption is, more or less explicitly, 
that more complex predicates can be de- 
composed into sets of simpler ones. The pro- 
jectibility of complex predicates is thus be- 
lieved to be reducible to generalization along 
more basic dimensions. 

Reductionism of this kind is highly re- 
strictive. It requires that there exist some 


which all complex concepts of objects and 
predicates can be aligned. This requirement 
has been by and large rejected for many rea- 
sons. One problem is that concepts tend to 
arise in systems, not individually. Even a sim- 
ple linguistic predicate like “is small” is con- 
strued very differently when applied to mice 
and when applied to elephants. Many pred- 
icates that people reason about are emer- 
gent properties whose existence depends on 
the attitude of a reasoning agent (consider 
“is beautiful” or a cloud that “looks like a 
mermaid”). So we cannot simply represent 
predicates as functions of simpler perceptual 
properties. Something else is needed, some- 
thing that respects the information we have 
about predicates via the relations of objects 
and predicates to one another. 

In the 1970s, the answer proffered was 
similarity (see Goldstone & Son, Chap. 2). 
The additional information required to 
project a predicate was the relative posi- 
tion of a category with respect to other 
categories; the question about one category 
could be decided based on knowledge of 
the predicate’s relation to other (similar) 
categories (see Medin & Rips, Chap. 3). 
Prior to the 1970s, similarity had gener- 
ally been construed as a distance in a fairly 
low-dimensional space (Shepard, 1980). In 
1977, Tversky proposed a new measure that 
posited that similarity could be computed 
over a large number of dimensions, that both 
common and distinctive features were essen- 
tial to determine the similarity between any 
pair of objects, and, critically, that the set 
of features used to measure similarity was 
context dependent. Features depended on 
their diagnosticity in the set of objects be- 
ing compared and on the specific task used to 
measure similarity. Tversky’s contrast model 
of similarity would, it was hoped, prove 
to have sufficient representational power to 
model a number of cognitive tasks, including 
categorization and induction. 
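
The contrast model is usually written as a weighted difference between common and distinctive features. The sketch below is one minimal reading of that idea: it assumes the salience function is simply a count of features, and the weights theta, alpha, and beta, together with the bird feature sets, are hypothetical values chosen for illustration.

```python
def contrast_similarity(a: set, b: set, theta: float = 1.0,
                        alpha: float = 0.5, beta: float = 0.5,
                        salience=len) -> float:
    """Weighted common features minus weighted distinctive features of a and b."""
    common = salience(a & b)   # features shared by both objects
    a_only = salience(a - b)   # features distinctive of a
    b_only = salience(b - a)   # features distinctive of b
    return theta * common - alpha * a_only - beta * b_only


# Hypothetical feature sets.
robin = {"wings", "feathers", "flies", "small", "sings"}
sparrow = {"wings", "feathers", "flies", "small", "chirps"}
ostrich = {"wings", "feathers", "large", "runs"}

if __name__ == "__main__":
    print(contrast_similarity(robin, sparrow))
    print(contrast_similarity(robin, ostrich))
    # Unequal weights on the two sets of distinctive features make the measure asymmetric.
    print(contrast_similarity(robin, ostrich, alpha=0.8, beta=0.2))
    print(contrast_similarity(ostrich, robin, alpha=0.8, beta=0.2))
```

Context dependence would enter through which features are included in the sets and how heavily each is weighted; this sketch holds both fixed.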

The value of representing category struc- 
ture in terms of similarity was reinforced 
by Rosch’s (1973) efforts to construct a 
similarity-based framework for understand- 
ing natural categories. Her seminal work on 
the basic level of hierarchical category struc- 
ture provided the empirical basis for her ar- 
guments that categories were mentally rep- 
resented in a way that carved the world at 
its joints. She imagined categories as clusters 
in a vast high-dimensional similarity space 
that were devised to maximize the similar- 
ity within a cluster and minimize the simi- 
larity between clusters. Her belief that the 
structure of this similarity space was given 
by the world and was not simply a matter of 
subjective opinion implies that the similar- 
ity space contains a lot of information that 
can be used for a number of tasks, including 
inductive inference. 

Rosch (1978) suggested that the main 
purpose of category structure was to pro- 
vide the evidential base for relating predi- 
cates to categories. She attempted to moti- 
vate the basic level as the level of hierarchical 
structure that maximized the usefulness of a 
cue for choosing a category, what she called 
cue validity, the probability of a category 
given a cue. Basic-level categories were pre- 
sumed to maximize cue validity by virtue 
of being highly differentiated; members of 
a basic-level category have more common 
attributes than members of a superordinate, 
and they have fewer common attributes with 
other categories than do members of a subor- 
dinate. Murphy (1982) observed, however, 
that this will not work. The category with 
maximum probability given a cue is the most 
general category possible (“entity”), whose 
probability is 1 (or at least close to it). How- 
ever, Rosch’s idea can be elaborated using a 
measure of inductive projectibility in a way 
that succeeds in picking out the basic level. 
If the level of a hierarchy is selected by ap- 
pealing to the inductive potential of the cat- 
egory, say by maximizing category validity, 
the probability of a specific feature given a 
category, then one is driven in the opposite 
direction of cue validity, namely to the most 
specific level. Given a particular feature, one 
is pretty much guaranteed to choose a cate- 
gory with that feature by choosing a specific 
object known to have the feature. By trading 
off category and cue validity, the usefulness 
of a category for predicting a feature and of 
a feature for predicting a category, one can 


structure. Jones (1983) made this suggestion, 
calling it a measure of “collocation.” A more 
sophisticated information-theoretic analysis 
along these lines is presented in Corter and 
Gluck (1992) and Fisher (1987). 
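
The opposing pulls of cue validity and category validity can be seen in a small worked example. The sketch below assumes a hypothetical toy world of nine animals, a single feature ("flies"), and three nested category levels; it also assumes that the trade-off is computed as the product of the two validities, which is one common way of summarizing Jones's collocation measure. None of the counts come from the studies cited.

```python
from fractions import Fraction

# Hypothetical toy world: (kind of animal, does it have the feature "flies"?)
INSTANCES = [
    ("robin", True), ("robin", True), ("sparrow", True), ("penguin", False),
    ("bat", True),
    ("trout", False), ("trout", False), ("trout", False), ("trout", False),
]

# Three nested levels, from most general to most specific.
CATEGORIES = {
    "entity": {"robin", "sparrow", "penguin", "bat", "trout"},
    "bird": {"robin", "sparrow", "penguin"},
    "robin": {"robin"},
}


def cue_validity(category: str) -> Fraction:
    """P(category | feature): share of feature-bearing instances that fall in the category."""
    with_feature = [kind for kind, has in INSTANCES if has]
    hits = [kind for kind in with_feature if kind in CATEGORIES[category]]
    return Fraction(len(hits), len(with_feature))


def category_validity(category: str) -> Fraction:
    """P(feature | category): share of category members that bear the feature."""
    members = [has for kind, has in INSTANCES if kind in CATEGORIES[category]]
    return Fraction(sum(members), len(members))


def collocation(category: str) -> Fraction:
    """Assumed trade-off: the product of cue validity and category validity."""
    return cue_validity(category) * category_validity(category)


def best_level(score) -> str:
    """Name of the category level that maximizes the given score."""
    return max(CATEGORIES, key=score)


if __name__ == "__main__":
    for name in CATEGORIES:
        print(name, cue_validity(name), category_validity(name), collocation(name))
```

In this toy world cue validity alone selects the most general level, category validity alone selects the most specific, and their product selects the intermediate level.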

Another quite different but complemen- 
tary line of work going on at about the 
same time as Rosch’s, with related implica- 
tions for inductive inference, was Tversky 
and Kahneman’s (1974) development of 
the representativeness heuristic of proba- 
bility and frequency judgment. The rep- 
resentativeness heuristic is essentially the 
idea that categorical knowledge is used to 
make probability judgments (see Kahneman 
& Frederick, Chap. 12). In that sense, it is an 
extension of Rosch’s insights about category 
structure. She showed that similarity was 
a guiding principle in decisions about cat- 
egory membership; Kahneman and Tversky 
showed that probability judgment could, in 
some cases, be understood as a process of 
categorization driven by similarity. To illus- 
trate, Linda is judged more likely to be a fem- 
inist bank teller than a bank teller (despite the 
conjunction rule of probability that disal- 
lows this conclusion) if she has characteristic 
feminist traits (i.e., if she seems like she is a 
member of the category of feminists). 
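
The conjunction rule mentioned here can be stated in one line. Writing F for "Linda is a feminist" and B for "Linda is a bank teller" (labels introduced here only for illustration):

```latex
P(F \wedge B) \;=\; P(B)\,P(F \mid B) \;\le\; P(B),
\qquad \text{because } 0 \le P(F \mid B) \le 1 .
```

However representative of feminists Linda seems, the conjunction cannot be more probable than either of its parts.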

In sum, the importance of similarity for 
how people make inductive inferences was 
recognized in the 1970s in the study of nat- 
ural category structure and probability judg- 
ment and manifested in the development of 
models of similarity. Rips (1975) put these 
strands together in the development of a cat- 
egorical induction task. He told people that 
all members of a particular species of animal 
on a small island had a particular contagious 
disease and asked participants to guess what 
proportion of other species would also have 
the disease. For example, if all rabbits have it, 
what proportion of dogs would? Rips found 
that judgments went up with the similarity 
of the two categories and with the typicality 
of the first (premise) category. 

Relatively little work on categorical in- 
duction was performed by cognitive psy- 
chologists immediately following Rips’s 
seminal work. Instead, the banner was pur- 
sued by developmental psychologists such as 
schema that children learn through devel- 
opment and how they use those schema to 
make inductive inferences across categories. 
In particular, she showed that adults and 
10-year-olds used general biological knowl- 
edge to guide their inductions about novel 
animal properties, whereas small children 
based their inductions on knowledge about 
humans. Gelman and Markman (1986) ar- 
gued that children prefer to make inductive 
inferences using category structure rather 
than superficial similarity. However, it was 
the theoretical discussion and mathematical 
models of Osherson and his colleagues, dis- 
cussed in what follows, that led to an ex- 
plosion of interest by cognitive psychologists 
with a resulting menu of models and phe- 
nomena to constrain them. 


Scope of Chapter 


To limit the scope of this chapter, in the 
remainder we focus exclusively on the psy- 
chology of categorical induction: How peo- 
ple arrive at a statement of their confidence 
that a conclusion category has a predicate 
after being told that one or more premise 
categories do. As Goodman’s (1955) analysis 
makes clear, this is a very general problem. 
Nevertheless, we do not address a number 
of issues related to induction. For example, 
we do not address how people go about se- 
lecting evidence to support a hypothesis (see 
Doherty et al., 1996; Klayman & Ha, 1987; 
Oaksford & Chater, 1994). We do not ad- 
dress how people discover hypotheses but 
rather focus only on their degree of cer- 
tainty in a prespecified hypothesis (cf. the 
distinction between the contexts of discov- 
ery and confirmation; Reichenbach, 1938). 
This rules out a variety of work on the topic 
of hypothesis discovery (e.g., Klahr, 2000; 
Klayman, 1988). Relatedly, we do not cover 
the variety of work on the topic of cue learn- 
ing, that is, how people learn the predictive 
or diagnostic value of stimuli (see Buehner 
& Cheng, Chap. 7). 

Most of our discussion concerns the eval- 
uation of categorical arguments, such as 


Boys use GABA as a neurotransmitter. 


Therefore, girls use GABA as a neu- 
rotransmitter. 


that can be written schematically as a list of 
sentences: 


P_1, ..., P_n / C (5.1) 


in which the P_i are the premises of an argu- 
ment and C is the conclusion. Each state- 
ment includes a category (e.g., boys) to 
which is applied a predicate (e.g., use GABA 
as a neurotransmitter). In most of the exam- 
ples discussed, the categories will vary across 
statements, whereas the predicate will re- 
main constant. The general question will be 
how people go about determining their be- 
lief in the conclusion of such an argument af- 
ter being told that the premises are true. We 
discuss this question both by trying to de- 
scribe human judgment as a set of phenom- 
ena and by trying to explain the existence 
of these phenomena in terms of more fun- 
damental and more general principles. The 
phenomena will concern judgments of the 
strength of categorical arguments or the con- 
vincingness of an argument or some other 
measure of belief in the conclusion once the 
premises are given (reviewed by Heit, 2000). 

One way to represent the problem we 
address is in terms of conditional probabil- 
ity. The issue can be construed in terms of 
how people make judgments of the follow- 
ing form: 


P(Category C has some property | 
Categories P_1, ..., P_n have the property) 


Indeed, some of the tasks we discuss involve 
a conditional probability judgment explic- 
itly. But even those that do not, such as ar- 
gument strength, can be directly related to 
judgments of conditional probability. 

Most of the experimental work we ad- 
dress attempts to restrict attention to how 
people use categories to reason by minimiz- 
ing the role of the predicate in the reasoning 
process. To achieve this, arguments are usu- 
ally restricted to “blank” predicates — pred- 
icates that use relatively unfamiliar terms 
(e.g., “use GABA as a neurotransmitter”) so 
they do not contribute much to how people 
Smith, Wilkie, Lopez, & Shafir, 1990). They 
do contribute some, however. For instance, 
all the predicates applied to animals are ob- 
viously biological in nature, thus suggest- 
ing that the relevant properties for reason- 
ing are biological. Lo, Sides, Rozelle, and 
Osherson (2002) characterized blank pred- 
icates as “indefinite in their application to 
given categories, but clear enough to com- 
municate the kind of property in question” 
(p. 183). 

Philosophers such as Carnap (1950) and 
Hacking (2001) have distinguished inten- 
sional and extensional representations of 
probability (sometimes called epistemic vs. 
aleatory representations). Correspondingly, 
in psychology we can distinguish modes 
of inference that depend on assessment of 
similarity structure and modes that depend 
on analyses of set structure [see Lagnado 
& Sloman, (2004), for an analysis of the 
correspondence between the philosophical 
and psychological distinctions]. We refer 
to the former as the inside view of cate- 
gory structure and the latter as the out- 
side view (Sloman & Over, 2003; Tversky 
& Kahneman, 1983). In this chapter, we fo- 
cus on induction from the inside via simi- 
larity structure. We thus neglect a host of 
work concerning, for example, how people 
make conditional probability judgments in 
the context of well-defined sample spaces 
(e.g., Johnson-Laird et al., 1999), reasoning 
using explicit statistical information (e.g., 
Nisbett, 1993), and the relative advantages 
of different kinds of representational format 
(e.g., Tversky & Kahneman, 1983). 


Two Theoretical Approaches 
to Inductive Reasoning 


A number of theoretical approaches have 
been taken to the problem of categorical in- 
duction in psychology. Using broad strokes, 
the approaches can be classified into two 
groups: similarity-based induction and induc- 
tion as scientific methodology. We discuss each 
in turn. As becomes clear, the approaches are 
not mutually exclusive both because they 


at different levels of abstraction. 


Similarity-Based Induction 


Perhaps the most obvious and robust pre- 
dictor of inductive strength is similarity. In 
the simplest case, most people are willing 
to project a property known to be true of 
(say) crocodiles to a very similar class, such 
as alligators, with some degree of confidence. 
Such willingness exists either because simi- 
larity is a mechanism of induction (Osherson 
et al., 1990) or because induction and sim- 
ilarity judgment have some common an- 
tecedent (Sloman, 1993). From the scores of 
examples of the representativeness heuris- 
tic at work (Tversky & Kahneman, 1974) 
through Rosch’s (1973) analysis of typicality 
in terms of similarity, a strong correlation be- 
tween probability and similarity is more the 
rule than the exception. The argument has 
been made that similarity is not a real expla- 
nation at all (Goodman, 1972; see the review 
in Sloman & Rips, 1998) and phenomena ex- 
ist that contradict prediction based only on 
similarity (e.g., Gelman & Markman, 1986). 
Nevertheless, similarity remains the key con- 
struct in the description and explanation of 
inductive phenomena. 

Consider the similarity and typicality 
phenomena (Lopez, Atran, Coley, Medin, 
& Smith, 1997; Osherson et al., 1990; Rips, 
1975): 


Similarity 

Arguments are strong to the extent that 
categories in the premises are similar to 
the conclusion category. For example, 


Robins have sesamoid bones. 


Therefore, sparrows have sesamoid 
bones. 


is judged stronger than 


Robins have sesamoid bones. 


Therefore, ostriches have sesamoid 
bones. 


because robins are more similar to spar- 
rows than to ostriches. 


Typicality 
The more typical premise categories are 
of the conclusion category, the stronger 
the argument. For example, people 
are more willing to project a predicate 
from robins to birds than from penguins 
to birds because robins are more typical 
birds than penguins. 


The first descriptive mathematical ac- 
count of phenomena like these expressed 
argument strength in terms of similarity. 
Osherson et al. (1990) posited the similarity- 
coverage model that proposed that people 
make categorical inductions on the basis of 
two principles, similarity and category cover- 
age. Category coverage was actually cashed 
out in terms of similarity. According to the 
model, arguments are deemed strong to the 
degree that premise and conclusion cate- 
gories are similar and to the degree that 
premises “cover” the lowest-level category 
that includes both premise and conclusion 
categories. The idea is that the categories 
present in the argument elicit their com- 
mon superordinate — in particular, the most 
specific superordinate that they share. Cate- 
gory coverage is determined by the similar- 
ity between the premise categories and all 
the categories contained in this lowest-level 
superordinate. 
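
The two terms of the model can be sketched in a few lines. The code below is a simplified reading, not the authors' implementation: it assumes that similarity to a conclusion is the maximum similarity of any premise to it, that coverage is the average of that maximum over the members of the lowest-level superordinate, and that the two terms are mixed by a weight alpha. The pairwise similarity values, the four-member bird category, and alpha = 0.5 are all hypothetical.

```python
from statistics import mean

# Hypothetical members of the lowest-level superordinate "bird".
BIRDS = ["robin", "sparrow", "ostrich", "penguin"]

# Hypothetical pairwise similarity ratings in [0, 1].
_SIM = {
    frozenset(("robin", "sparrow")): 0.9,
    frozenset(("robin", "ostrich")): 0.3,
    frozenset(("robin", "penguin")): 0.3,
    frozenset(("sparrow", "ostrich")): 0.3,
    frozenset(("sparrow", "penguin")): 0.3,
    frozenset(("ostrich", "penguin")): 0.5,
}


def sim(a: str, b: str) -> float:
    """Symmetric lookup; every category is maximally similar to itself."""
    return 1.0 if a == b else _SIM[frozenset((a, b))]


def max_sim(premises, target: str) -> float:
    """Similarity of the best-matching premise category to the target."""
    return max(sim(p, target) for p in premises)


def coverage(premises, members) -> float:
    """Average, over superordinate members, of the best premise match."""
    return mean([max_sim(premises, m) for m in members])


def strength(premises, conclusion: str, members, alpha: float = 0.5) -> float:
    """Weighted mix of similarity and coverage; coverage alone for a general conclusion."""
    if conclusion in members:
        return alpha * max_sim(premises, conclusion) + (1 - alpha) * coverage(premises, members)
    return coverage(premises, members)


if __name__ == "__main__":
    print(strength(["robin"], "sparrow", BIRDS), strength(["robin"], "ostrich", BIRDS))
    print(strength(["robin"], "bird", BIRDS), strength(["penguin"], "bird", BIRDS))
```

With these made-up ratings the sketch reproduces both phenomena: robins support sparrows more than ostriches (similarity), and robins support the bird category more than penguins do (typicality, through coverage).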

Sloman (1993) proposed a competing 
theory of induction that reduces the two 
principles of similarity and category cov- 
erage into a single principle of feature 
coverage. Instead of appealing to a class in- 
clusion hierarchy of superordinates and sub- 
ordinates, this theory appeals to the extent of 
overlap among the properties of categories. 
Predicates are projected from premise cate- 
gories to a conclusion category to the degree 
that the previously known properties of the 
conclusion category are also properties of the 
premise categories — specifically, in propor- 
tion to the number of conclusion category 
features that are present in the premise cat- 
egories. Both models can explain the simi- 
larity, typicality, and asymmetry phenomena 
(Rips, 1975): 


Asymmetry 

Switching premise and conclusion cate- 
gories can lead to arguments of different 
strength: 


Tigers have 38 chromosomes. 


Therefore, buffaloes have 38 chromo- 
somes. 


is judged stronger than 


Buffaloes have 38 chromosomes. 


Therefore, tigers have 38 chromo- 
somes. 


The similarity-coverage model explains it 
by appealing to typicality. Tigers are more 
typical mammals than buffaloes and there- 
fore tigers provide more category coverage. 
The feature-based model explains it by ap- 
pealing to familiarity. Tigers are more fa- 
miliar than buffaloes and therefore have 
more features. So the features of tigers cover 
more of the features of buffaloes than vice 
versa. 
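
The feature-based account of asymmetry can be illustrated with a deliberately simplified, set-based stand-in for the model: argument strength is taken to be the proportion of the conclusion category's known features that also belong to some premise category. This is an assumption made for illustration rather than the formulation in Sloman (1993), and the feature lists are hypothetical.

```python
def feature_coverage(premise_features, conclusion_features: set) -> float:
    """Proportion of conclusion features that appear in at least one premise category."""
    known = set().union(*premise_features)
    return len(conclusion_features & known) / len(conclusion_features)


# Hypothetical feature sets: the more familiar category has more known features.
tiger = {"fur", "four_legs", "tail", "warm_blooded",
         "stripes", "predator", "claws", "roars"}
buffalo = {"fur", "four_legs", "tail", "warm_blooded", "horns"}

if __name__ == "__main__":
    print("tigers -> buffaloes:", feature_coverage([tiger], buffalo))
    print("buffaloes -> tigers:", feature_coverage([buffalo], tiger))
```

Because the shared features make up a larger share of what is known about buffaloes than of what is known about tigers, projecting from tigers to buffaloes comes out stronger than the reverse.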

Differences between the models play out 
in the analysis of several phenomena. The 
similarity-coverage model focuses on rela- 
tions among categories; the feature-based 
model on relations among properties. Con- 
sider diversity (Osherson et al., 1990): 


Diversity 

The less similar premises are to each 
other, the stronger the argument tends to 
be. People are more willing to draw the 
conclusion that all mammals love onions 
from the fact that hippos and hamsters 
love onions than from the fact that hip- 
pos and rhinos do because hippos and 
rhinos are more similar than hippos and 
hamsters. 


The phenomenon has been demonstrated 
on several occasions with Western adults 
(e.g., Lopez, 1995), although some evi- 
dence suggests the phenomenon does not 
always generalize to other groups. Lopez 
et al. (1997) failed to find diversity ef- 
fects among Itza’ Maya. Proffitt, Coley, and 
Medin (2000) found that parks mainte- 
nance workers did not show diversity effects 
when reasoning about trees, although tree 
taxonomists did. Bailenson, Shum, Atran, 
Medin, and Coley (2002) did not find di- 
versity effects with either Itza’ Maya or 
bird experts. There is also some evidence 
that children are not sensitive to diversity 
(Carey, 1985; Gutheil & Gelman, 1997; 
Lopez, Gelman, Gutheil, & Smith, 1992). 
However, using materials of greater interest 
find diversity effects with 5- and 6-year-olds. 
The data show only mixed support for the 
phenomenon. Nevertheless, it is predicted 
by the similarity-coverage model. Categories 
that are less similar will tend to cover the su- 
perordinate that includes them better than 
categories that are more similar. The feature- 
based model also predicts the phenomenon 
as a result of feature overlap. When cate- 
gories differ, their features have relatively 
little overlap, and thus they cover a larger 
part of feature space; when categories are 
similar, their coverage of feature space is 
more redundant. As a result, more dissim- 
ilar premises are more likely to show more 
overlap with a conclusion category. How- 
ever, this is not necessarily so and, indeed, 
the feature-based model predicts a bound- 
ary condition on diversity (Sloman, 1993): 


Feature exclusion 

A premise category that has little over- 
lap with the conclusion category should 
have no effect on argument strength 
even if it leads to a more diverse set of 
premises. For example, 


Fact: German Shepherds have sesa- 
moid bones. 


Fact: Giraffes have sesamoid bones. 


Conclusion: Moles have sesamoid 
bones. 


is judged stronger than 


Fact: German Shepherds have sesa- 
moid bones. 


Fact: Blue whales have sesamoid bones. 


Conclusion: Moles have sesamoid 
bones. 


even though the second argument has a 
more diverse set of premises than the first. 
The feature-based model explains this by ap- 
pealing to the lack of feature overlap be- 
tween blue whales and moles over and above 
the overlap between German Shepherds and 
moles. To explain this phenomenon, the 
similarity-coverage model must make the ad 
hoc assumption that blue whales are not sim- 
ilar enough to other members of the lowest- 
level category, including all categories in 
the arguments (presumably mammals), 


giraffes. 
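
Both the feature-overlap account of diversity and the feature exclusion boundary condition can be illustrated with a simplified set-based sketch, in which strength is the proportion of conclusion features found in at least one premise category. This is an illustrative assumption rather than the model as published, and every feature list below is hypothetical.

```python
def feature_coverage(premise_features, conclusion_features: set) -> float:
    """Proportion of conclusion features that appear in at least one premise category."""
    known = set().union(*premise_features)
    return len(conclusion_features & known) / len(conclusion_features)


# Hypothetical feature sets.
hippo = {"warm_blooded", "nurses_young", "large", "thick_skin", "semi_aquatic"}
rhino = {"warm_blooded", "nurses_young", "large", "thick_skin", "horn"}
hamster = {"warm_blooded", "nurses_young", "small", "fur", "burrows"}
mammal = {"warm_blooded", "nurses_young", "fur", "four_legs"}

german_shepherd = {"warm_blooded", "nurses_young", "fur", "barks", "loyal"}
blue_whale = {"warm_blooded", "nurses_young", "huge", "swims", "eats_krill"}
mole = {"warm_blooded", "nurses_young", "fur", "small", "burrows"}

if __name__ == "__main__":
    # Diversity: dissimilar premises jointly cover more of the conclusion's features.
    print(feature_coverage([hippo, hamster], mammal))
    print(feature_coverage([hippo, rhino], mammal))
    # Feature exclusion: the extra premise shares nothing new with the conclusion.
    print(feature_coverage([german_shepherd], mole))
    print(feature_coverage([german_shepherd, blue_whale], mole))
```

In this sketch a more diverse premise set helps only when the added category contributes conclusion features that the other premises lack; the blue whale adds diversity but no new overlap with moles, so it leaves strength unchanged.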


Monotonicity and Nonmonotonicity 
When premise categories are sufficiently 
similar, adding a supporting premise will 
increase the strength of an argument. 
However, a counterexample to mono- 
tonicity occurs when a premise with a 
category dissimilar to all other categories 
is introduced: 


Crows have strong sternums. 
Peacocks have strong sternums. 


Therefore, birds have strong sternums. 
is stronger than 


Crows have strong sternums. 
Peacocks have strong sternums. 
Rabbits have strong sternums. 


Therefore, birds have strong sternums. 


The similarity-coverage model explains non- 
monotonicity through its coverage term. 
The lowest-level category that must be cov- 
ered in the first argument is birds because all 
categories in the argument are birds. How- 
ever, the lowest-level category that must be 
covered in the second argument is more 
general — animals — because rabbits are not 
birds. Worse, rabbits are not similar to very 
many animals; therefore, the category does 
not contribute much to argument strength. 
The feature-based model cannot explain this 
phenomenon except with added assump- 
tions — for example, that the features of 
highly dissimilar premise categories com- 
pete with one another — as explanations for 
the predicate (see Sloman, 1993). 
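
The coverage explanation of nonmonotonicity can be made concrete with a small sketch. It assumes a simplified version of the similarity-coverage computation: strength for the conclusion "birds" mixes coverage of the bird category with coverage of the lowest-level category that contains every premise category. The category members, the similarity rule, and the weight alpha = 0.5 are hypothetical.

```python
from statistics import mean

# Hypothetical category members.
BIRDS = ["crow", "peacock", "robin", "sparrow"]
ANIMALS = BIRDS + ["rabbit", "trout", "lizard", "beetle"]


def sim(a: str, b: str) -> float:
    """Hypothetical similarity rule, chosen only to make the example concrete."""
    if a == b:
        return 1.0
    if a in BIRDS and b in BIRDS:
        return 0.7
    if "rabbit" in (a, b):
        return 0.2   # rabbits resemble few other animals
    return 0.1


def coverage(premises, members) -> float:
    """Average, over members, of the similarity of the best-matching premise."""
    return mean([max(sim(p, m) for p in premises) for m in members])


def lowest_superordinate(premises):
    """Birds if every premise is a bird; otherwise the more general category, animals."""
    return BIRDS if all(p in BIRDS for p in premises) else ANIMALS


def strength_for_birds(premises, alpha: float = 0.5) -> float:
    """Mix of coverage of birds and coverage of the lowest-level superordinate."""
    return (alpha * coverage(premises, BIRDS)
            + (1 - alpha) * coverage(premises, lowest_superordinate(premises)))


if __name__ == "__main__":
    print(strength_for_birds(["crow", "peacock"]))
    print(strength_for_birds(["crow", "peacock", "robin"]))    # similar premise added
    print(strength_for_birds(["crow", "peacock", "rabbit"]))   # dissimilar premise added
```

Adding a similar premise (robins) raises the score, whereas adding rabbits forces coverage to be computed over animals, most of which no premise resembles, and the score falls.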

As the analysis of nonmonotonicities 
makes clear, the feature-coverage model dif- 
fers from the similarity-coverage model pri- 
marily in that it appeals to properties of cat- 
egories rather than instances in explaining 
induction phenomena and, as a result, in not 
appealing to the inheritance relations of a 
class inclusion hierarchy. That is, it assumes 
people will not in general infer that a cate- 
gory has a property because its superordi- 
nate does. Instead, it assumes that people 
think about categories in terms of their struc- 
tural relations, in terms of property over- 
lap and relations among properties. This 
is surely the explanation for the inclusion 
Inclusion Fallacy 

Similarity relations can override categor- 
ical relations between conclusions. Most 
people judge 


All robins have sesamoid bones. 


Therefore, all birds have sesamoid 
bones. 


to be stronger than 


All robins have sesamoid bones. 


Therefore, all ostriches have sesamoid 
bones. 


Of course, ostriches are birds, and so the first 
conclusion implies the second; therefore, the 
second argument must be stronger than the 
first. Nevertheless, robins are highly typical 
birds and therefore similar to other birds. 
Yet they are distinct from ostriches. These 
similarity relations determine most people’s 
judgments of argument strength rather than 
the categorical relation. 
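
The normative point can be put in terms of probability. Writing R for the premise about robins, B for "all birds have the property," and O for "all ostriches have the property" (symbols introduced here only for illustration), B entails O, so:

```latex
B \Rightarrow O
\quad\text{implies}\quad
P(B \mid R) \;\le\; P(O \mid R).
```

On this reading the argument to ostriches can be no weaker than the argument to birds, which is the ordering that most people's judgments reverse.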

An even more direct demonstration of 
failure to consider category inclusion rela- 
tions is the following (Sloman, 1993, 1998): 


Inclusion Similarity 

Similarity relations can override even 
transparent categorical relations be- 
tween premise and conclusion. People do 
not always judge 


Every individual body of water has a 
high number of seiches. 


Every individual lake has a high num- 
ber of seiches. 


to be perfectly strong even when they 
agree that a lake is a body of water. More- 
over, they judge 


Every individual body of water has a 
high number of seiches. 


Every individual reservoir has a high 
number of seiches. 


to be even weaker, presumably because 
reservoirs are less typical bodies of water 
than lakes. 


These examples suggest that category inclu- 
sion knowledge has only a limited role in 
inductive inference. This might be related 
to the limited role of inclusion relations in 
other kinds of categorization tasks. For ex- 
ample, Hampton (1982) showed intransitiv- 
ities in category verification using everyday 
objects. He found, for example, that people 
affirmed that “A car headlight is a kind of a 
lamp” and that “A lamp is a kind of furni- 
ture,” but not “A car headlight is a kind of 
furniture.” 

People are obviously capable of infer- 
ring a property from a general to a more 
specific category. Following an explanation 
that appeals to inheritance is not difficult (I 
know naked mole rats have livers because all 
mammals have livers). However, the inclu- 
sion fallacy and the inclusion similarity phe- 
nomenon show that such information is not 
inevitably, and therefore not automatically, 
included in the inference process. 

Gelman and Markman (1986) showed that 
children use category labels to mediate 
induction: 


Naming effect 

Children prefer to project predicates be- 
tween objects that look similar rather 
than objects that look dissimilar. How- 
ever, this preference is overridden when 
the dissimilar objects are given similar 
labels. 


Gelman and Coley (1990) showed that chil- 
dren as young as 2 years old are also sensitive 
to the use of labels. So, on the one hand, peo- 
ple are extremely sensitive to the informa- 
tion provided by labels when making induc- 
tive inferences. On the other hand, the use 
of structured category knowledge for induc- 
tive inference seems to be a derivative abil- 
ity, not a part of the fabric of the reasoning 
process. This suggests that the naming effect 
does not concern how people make infer- 
ences using knowledge about category struc- 
ture per se, because if the use of structural 
knowledge is not automatic, very young chil- 
dren would not be expected to use it. Rather, 
the effect seems to be about the pragmatics 
of language — in particular, how people use 
language to mediate induction. The nam- 
ing effect probably results from people’s ex- 
treme sensitivity to experimenters’ linguistic 
cues. Even young children apparently have 
menter gives two objects similar labels, the 
experimenter is giving a hint, a hint that the 
objects should be treated similarly at least in 
the context of the experiment. This ability 
to take cues from others, and to use language 
to do so, may well be key mechanisms of hu- 
man induction. 

This is also the conclusion of cross- 
cultural work by Coley, Medin, and Atran 
(1997). Arguments are judged stronger the 
more specific the categories involved. If told 
that dalmatians have an ulnar artery, peo- 
ple are more willing to generalize ulnar ar- 
teries to dogs than to animals (Osherson 
et al., 1990). Coley et al. (1997) compared 
people’s willingness to project predicates 
from various levels of the hierarchy of liv- 
ing things to a more general level. For ex- 
ample, when told that a subspecific category 
such as “male black spider monkey” is sus- 
ceptible to an unfamiliar disease, did partic- 
ipants think that the members of the folk- 
specific category “black spider monkey” were 
susceptible? If members of the specific cat- 
egory were susceptible, then were members 
of the folk-generic category (“spider mon- 
key”) also susceptible? If members of the 
generic category were susceptible, then were 
members of the life-form category (“mam- 
mal”) also susceptible? Finally, if the life- 
form category displayed susceptibility, then 
did the kingdom (“animal”)? Coley et al. 
found that both American college students 
and members of a traditional Mayan village 
in lowland Guatemala showed a sharp drop- 
off at a certain point: 


Preferred level of induction 

People are willing to make an inductive 
inference with confidence from a subor- 
dinate to a near superordinate up to the 
folk-generic level; their willingness drops 
off considerably when making inferences 
to categories more abstract. 


These results are consistent with Berlin’s 
(1992) claim that the folk-generic level is 
the easiest to identify, the most commonly 
distinguished in speech, and serves best to 
distinguish categories. Therefore, one might 
imagine that the folk-generic level would 


are often used to organize hierarchical lin- 
guistic and conceptual categories (Brown, 
1958; Rosch et al., 1976; see Murphy, 2002, 
for a review). Nevertheless, the dominance 
of generic categories was not expected by 
Coley et al. (1997) because Rosch et al. 
(1976) had found that for the biological cate- 
gories tree, fish, and bird, the life-form level 
was the category level satisfying a number 
of operational definitions of the basic level. 
For example, Rosch et al.’s American col- 
lege students preferred to call objects they 
were shown “tree,” “fish,” or “bird” rather 
than “oak,” “salmon,” or “robin.” 

Why the discrepancy? Why do American 
college students prefer to name an object a 
tree over an oak, yet prefer to project a prop- 
erty from all red oaks to all oaks rather than 
from all oaks to all trees? Perhaps they simply 
cannot identify oaks, and therefore fall back 
on the much more general “tree” in order 
to name. However, this begs the question: 
If students consider “tree” to be informative 
and precise enough to name things, why are 
they unwilling to project properties to it? 
Coley et al.’s (1997) answer to this conun- 
drum is that naming depends on knowledge; 
that is, names are chosen that are precise 
enough to be informative given what peo- 
ple know about the object being named. In- 
ductive inference, they argued, also depends 
on a kind of conventional wisdom. People 
have learned to maximize inductive potential
at a particular level of generality (the
folk-generic level) because culture and
linguistic convention specify that that is the
most informative level for projecting prop- 
erties (see Greenfield, Chap. 27). For exam- 
ple, language tends to use a single morpheme 
for naming generic level categories. This is 
a powerful cue that members of the same 
generic level have a lot in common and that 
therefore it is a good level for guessing that 
a predicate might hold across it. This idea is 
related to Shipley’s (1993) notion of overhy- 
potheses (cf. Goodman, 1955): that people 
use categorywide rules about certain kinds 
of properties to make some inductive in- 
ferences. For example, upon encountering a 
new species, people might assume members
of the species will vary more in
obesity than in, say, skin color (Nisbett et al.,
1983) despite having no particular knowl- 
edge about the species. 

This observation poses a challenge to 
feature- and similarity-based models of in- 
duction (Heit, 1998; Osherson et al., 1990; 
Sloman, 1993). These models all start from 
the assumption that people induce new 
knowledge about categories from old knowl- 
edge about the same categories. However, if 
people make inductive inferences using not 
only specific knowledge about the categories 
at hand but also distributional knowledge 
about the likelihood of properties at differ- 
ent hierarchical levels, knowledge that is in 
part culturally transmitted via language, then 
more enters the inductive inference pro- 
cess than models of inductive process have 
heretofore allowed. 

Mandler and McDonough (1998) argued 
that the basic-level bias comes relatively late, 
and demonstrated that 14-month-old infants 
show a bias to project properties within a 
broad domain (animals or vehicles) rather 
than at the level usually considered to be 
basic. This finding is not inconsistent with 
Coley et al.’s (1997) conclusion because 
the distributional and linguistic properties 
that they claim mediate induction presum- 
ably have to be learned, and so finding a 
basic-level preference only amongst adults 
is sufficient for their argument. Mandler 
and McDonough (1998) argued that infants’ 
predilection to project to broad domains 
demonstrates an initial propensity to rely 
on “conceptual” as opposed to “perceptual” 
knowledge as a basis for induction, meaning 
that infants rely on the very abstract com- 
monalities among animals as opposed to the 
perhaps more obvious physical differences 
among basic-level categories (pans vs. cups 
and cats vs. dogs). Of course, pans and cups 
do have physical properties in common that 
distinguish them from cats and dogs (e.g., 
the former are concave, the latter have
articulating limbs). Moreover, the distinction
between perceptual and conceptual prop- 
erties is tenuous. Proximal and distal stim- 
uli are necessarily different (i.e., even the 
eye engages in some form of interpretation), 


about what is being perceived affects what 
is perceived (e.g., Gregory, 1973). Nevertheless,
as suggested by the following phenomena,
induction is mediated by knowledge
of categories’ role in causal systems;
beliefs about the way the world works influ- 
ence induction as much as overlap of prop- 
erties does. Mandler and McDonough’s data 
provide evidence that this is true even for 
14-month-olds. 


Induction as Scientific Methodology 


Induction is of course not merely the 
province of individuals trying to accom- 
plish everyday goals, but also one of the 
main activities of science. According to one 
common view of science (Carnap, 1966; 
Hempel, 1965; Nagel, 1961; for opposing 
views, see Hacking, 1983; Popper, 1963), sci- 
entists spend much of their time trying to 
induce general laws about categories from 
particular examples. It is natural, therefore, 
to look to the principles that govern induc- 
tion in science to see how well they describe 
individual behavior (for a discussion of sci- 
entific reasoning, see Dunbar & Fugelsang, 
Chap. 29). Psychologists have approached 
induction as a scientific enterprise in three 
different ways. 


THE RULES OF INDUCTION 


First, some have examined the extent to 
which people abide by the normative rules 
of inductive inference that are generally ac- 
cepted in the scientific community. One 
such rule is that properties that do not vary 
much across category instances are more 
projectible across the whole category than 
properties that vary more. Nisbett et al. 
(1983) showed that people are sensitive to 
this rule: 


Variability/Centrality 

People are more willing to project predi- 
cates that tend to be invariant across cat- 
egory instances than variable predicates. 
For example, people who are told that 
one Pacific island native is overweight 
tend to think it is unlikely that all natives
of the island are overweight because
weight tends to vary within a group. In
contrast, if told the native has dark skin,
they are more likely to generalize to all 
natives because skin color tends to be 
more uniform within a race. 


However, sensitivity to variability does 
not imply that people consider the variabil- 
ity of predicates in the same deliberative 
manner that a scientist should. This phe- 
nomenon could be explained by a sensitivity 
to centrality (Sloman, Love, & Ahn, 1998). 
Given two properties A and B, such that B 
depends on A but A does not depend on 
B, people are more willing to project prop- 
erty A than property B because A is more 
causally central than B, even if A and B 
are equated for variability (Hadjichristidis, 
Sloman, Stevenson, & Over, 2004). More 
central properties tend to be less variable. 
Having a heart is more central and less vari- 
able among animals than having hair. Cen- 
trality and variability are almost two sides of 
the same coin (the inside and outside views, 
respectively). In Nisbett et al.’s case, having 
dark skin may be seen as less variable than 
obesity by virtue of being more central and 
having more apparent causal links to other 
features of people. 

The diversity principle is sometimes iden- 
tified as a principle of good scientific prac- 
tice (e.g., Heit & Hahn, 2001; Hempel, 1965; 
López, 1995). Yet, Lo et al. (2002) argued
against the normative status of diversity. 
They consider the following argument: 


House cats often carry the parasite 
Floxum. 


Field mice often carry the parasite 
Floxum. 


All mammals often carry the parasite
Floxum.

which they compare to


House cats often carry the parasite 
Floxum. 


Tigers often carry the parasite Floxum. 


All mammals often carry the parasite 
Floxum. 


Even though the premise categories of the 
first argument are more diverse (house cats
are less similar to field mice than to tigers),
the second argument might seem stronger
because house cats could conceivably be- 
come infected with the parasite Floxum 
while hunting field mice. Even if you do not 
find the second argument stronger, merely 
accepting the relevance of this infection sce- 
nario undermines the diversity principle, 
which prescribes that the similarity principle 
should be determinative for all pairs of ar- 
guments. At minimum, it shows that the di- 
versity principle does not dominate all other 
principles of sound inference. 

Lo et al. (2002) proved that a different 
and simple principle of argument strength 
does follow from the Bayesian philosophy 
of science. Consider two arguments with 
the same conclusion in which the conclu- 
sion implies the premises. For example, the 
conclusion “every single mammal carries 
the parasite Floxum” implies that “every sin- 
gle tiger carries the parasite Floxum” (on the 
assumption that “mammal” and “tiger” re- 
fer to natural, warm-blooded animals). In 
such a case, the argument with the less 
likely premises should be stronger. Lo et al. 
referred to this as the premise probability 
principle. In a series of experiments, they 
show that young children in both the United 
States and Taiwan make judgments that con- 
form to this principle. 
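
The principle follows from Bayes' rule in one step. If the conclusion C implies the premise P, then P(C and P) = P(C), so P(C | P) = P(C) / P(P): for a fixed conclusion, the less probable the premise is a priori, the higher the posterior belief in the conclusion. The following is a minimal Python sketch of that arithmetic. The function name, the field-mouse premise, and all probability values are hypothetical and chosen only for illustration; they are not taken from Lo et al. (2002).

```python
def posterior_given_implied_premise(p_conclusion, p_premise):
    """Return P(C | P) for a conclusion C that logically implies premise P.

    Because C implies P, P(C and P) = P(C), so Bayes' rule reduces to
    P(C | P) = P(C) / P(P).
    """
    if not 0 < p_premise <= 1:
        raise ValueError("premise probability must be in (0, 1]")
    if not 0 <= p_conclusion <= p_premise:
        # C implies P, so C can never be more probable than P.
        raise ValueError("need 0 <= P(C) <= P(P) when C implies P")
    return p_conclusion / p_premise


# Hypothetical prior beliefs (illustrative numbers only).
p_all_mammals = 0.02  # P(every single mammal carries Floxum)
p_all_tigers = 0.40   # P(every single tiger carries Floxum): likelier a priori
p_all_mice = 0.10     # P(every single field mouse carries Floxum): less likely

strength_from_tigers = posterior_given_implied_premise(p_all_mammals, p_all_tigers)
strength_from_mice = posterior_given_implied_premise(p_all_mammals, p_all_mice)

print("premise about tigers:", strength_from_tigers)
print("premise about field mice:", strength_from_mice)
```

Under these assumed priors the argument with the less likely premise yields the larger posterior, which is all that the premise probability principle asserts.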


INDUCTION AS NAIVE SCIENTIFIC THEORIZING 


A second approach to induction as a scien- 
tific methodology examines the contents of 
beliefs, what knowledge adults and children 
make use of when making inductive infer- 
ences. Because knowledge is structured in a 
way that has more or less correspondence 
to the structure of modern scientific theo- 
ries, sometimes to the structure of old or 
discredited scientific theories, such knowl- 
edge is often referred to as a “naive the- 
ory” (Carey, 1985; Gopnik & Meltzoff, 1997; 
Keil, 1989; Murphy & Medin, 1985). One 
strong, contentful position (Carey, 1985) is 
that people are born with a small num- 
ber of naive theories that correspond to a 
small number of domains such as physics, 
biology, psychology, and so on, and that all 
other knowledge is constructed using these
theories. For example, other knowledge
might be a metaphorical
extension of these original naive theories (cf. 
Lakoff & Johnson, 1980). 

One phenomenon studied by Carey 
(1985) to support this position is 


Human bias 

Small children prefer to project a prop- 
erty from people rather than from other 
animals. Four-year-olds are more likely 
to agree that a bug has a spleen if told 
that a person does than if told that a 
bee does. Ten-year-olds and adults do 
not show this asymmetry and project as 
readily from nonhuman animals as from 
humans. 


Carey argued that this transition is due to a 
major reorganization of the child’s knowl- 
edge about animals. Knowledge is consti- 
tuted by a mutually constraining set of con- 
cepts that make a coherent whole in analogy 
to the holistic coherence of scientific the- 
ories. As a result, concepts do not change 
in isolation, but instead as whole networks 
of belief are reorganized (Kuhn, 1962). On 
this view, the human bias occurs because 
a 4-year-old’s understanding of biological 
functions is framed in terms of human be- 
havior, whereas older children and adults 
possess an autonomous domain of biologi- 
cal knowledge. 

A different enterprise is more descriptive; 
it simply shows the analogies between 
knowledge structures and scientific theories. 
For example, Gopnik and Meltzoff (1997) 
claimed that, just like scientists, both chil- 
dren and laypeople construct and revise 
abstract lawlike theories about the world. 
In particular, they maintain that the gen- 
eral mechanisms that underlie conceptual 
change in cognitive development mirror 
those responsible for theory change in ma- 
ture science. More specifically, even very 
young children project properties among 
natural kinds on the basis of latent, underly- 
ing commonalities between categories rather 
than superficial similarities (e.g., Gelman & 
Coley, 1990). So children behave like “little 
scientists” in the sense that their inductive 
inferences are more sensitive to the causal 
principles that govern objects’ composition
and behavior than to objects’ mere appearance,
even though appearance is, by definition,
more directly observable.

Of course, analogies between everyday 
induction and scientific induction have to 
exist. As long as both children and scien- 
tists have beliefs that have positive induc- 
tive potential, those beliefs are likely to have 
some correspondence to the world, and the 
knowledge of children and scientists will 
therefore have to show some convergence. 
If children did operate merely on the basis 
of superficial similarities, such things as pho- 
tographs and toy cars would forever stump 
them. Children have no choice but to be 
“little scientists,” merely to walk around the 
world without bumping into things. Because 
of the inevitability of such correspondences 
and because scientific theories take a multi- 
tude of different forms, it is not obvious that 
this approach, in the absence of a more fully 
specified model, has much to offer theories 
of cognition. Furthermore, proponents of 
this approach typically present a rather im- 
poverished view of scientific activity, which 
neglects the role of social and cultural norms 
and practices (see Faucher et al., 2002). Ef- 
forts to give the approach a more principled 
grounding have begun (e.g., Gopnik et al., 
2004; Rehder & Hastie, 2001; Sloman, Love, 
& Ahn, 1998). 

Lo et al. (2002) rejected the approach 
outright. They argue that it just does 
not matter whether people have repre- 
sentational structures that in one way or 
another are similar to scientific theories. 
The question that they believe has both 
prescriptive value for improving human 
induction and descriptive value for develop- 
ing psychological theory is whether what- 
ever method people use to update their be- 
liefs conforms to principles of good scientific 
practice. 


COMPUTATIONAL MODELS OF INDUCTION 


The third approach to induction as a sci- 
entific methodology is concerned with the 
representation of inductive structure with- 
out concern for the process by which peo- 
ple make inductive inferences. The approach 
takes its lead from Marr’s (1982) analysis of
the different levels of psychological analysis.
Models at the highest level, those that
concern themselves with a description of the 
goals of a cognitive system without direct de- 
scription of the manner in which the mind 
tries to attain those goals or how the system 
is implemented in the brain, are computa- 
tional models. Three kinds of computational 
models of inductive inference have been sug- 
gested, all of which find their motivation in 
principles of good scientific methodology. 


Induction as Hypothesis Evaluation

McDonald, Samuels, and Rispoli (1996) proposed
an account of inductive inference that
appeals to several principles of hypothesis 
evaluation. They argued that when judging 
the strength of an inductive argument, peo- 
ple actively construct and assess hypothe- 
ses in light of the evidence provided by the 
premises. They advanced three determinants 
of hypothesis plausibility: the scope of the 
conclusion, the number of premises that in- 
stantiate it, and the number of alternatives to 
it suggested by the premises. In their experi- 
ments, all three factors were good predictors 
of judged argument strength, although cer- 
tain pragmatic considerations, and a fourth 
factor — “acceptability of the conclusion” — 
were also invoked to fully cover the results. 

Despite the model’s success in explain- 
ing some judgments, others, such as non- 
monotonicity, are only dealt with by appeal 
to pragmatic postulates that are not de- 
fended in any detail. Moreover, the model 
is restricted to arguments with general con- 
clusions. Because the model is at a com- 
putational level of description, it does not 
make claims about the cognitive processes 
involved in induction. As we see next, other 
computational models do offer something 
in place of a process model that McDonald 
et al.’s (1996) framework does not: a rigor- 
ous normative analysis of an inductive task. 


Bayesian Models of Inductive Inference

Heit (1998) proposed that Bayes’ rule pro- 
vides a representation for how people de- 
termine the probability of the conclusion of 
a categorical inductive argument given that 
the premises are true. The idea is that people
combine degrees of prior belief with the
evidence provided by the premises to arrive at a
posterior degree of belief in the conclusion.
Prior beliefs concern relative likelihoods that 
each combination of categories in the argu- 
ment would all have the relevant property. 
For example, for the argument 


Cows can get disease X. 


Sheep can get disease X. 


Heit assumes people can generate beliefs 
about the relative prior probability that both 
cows and sheep have the disease, that cows 
do but sheep do not, and so on. These be- 
liefs are generated heuristically; people are 
assumed to bring to mind properties shared 
by cows and by sheep, properties that cows 
have but sheep do not, and so on. The prior 
probabilities reflect the ease of bringing each 
type of property to mind. Premises contri- 
bute other information as well — in this case, 
that only states in which cows indeed have 
the disease are possible. This can be used to 
update priors to determine a posterior de- 
gree of belief that the conclusion is true. 
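
The updating step just described can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the four prior values are invented (in Heit's account they would reflect how easily each type of property comes to mind), and the function and variable names are hypothetical rather than part of Heit's (1998) model.

```python
# Hypothetical prior degrees of belief over the four possible states of the
# world for a novel property. Keys are (cows_have_it, sheep_have_it).
# The numbers are illustrative assumptions, not values from Heit (1998).
priors = {
    (True, True): 0.30,    # property shared by cows and sheep
    (True, False): 0.10,   # cows only
    (False, True): 0.10,   # sheep only
    (False, False): 0.50,  # neither
}

COWS, SHEEP = 0, 1  # positions of the two categories in each state tuple


def update_on_premise(prior_beliefs, premise_index):
    """Condition on the premise that one category has the property.

    States inconsistent with the premise get likelihood 0 and consistent
    states get likelihood 1, so Bayes' rule amounts to discarding the
    inconsistent states and renormalizing the rest.
    """
    consistent = {s: p for s, p in prior_beliefs.items() if s[premise_index]}
    total = sum(consistent.values())
    if total == 0:
        raise ValueError("premise has zero prior probability")
    return {s: p / total for s, p in consistent.items()}


def belief_in_conclusion(beliefs, conclusion_index):
    """Sum belief over states in which the conclusion category has the property."""
    return sum(p for s, p in beliefs.items() if s[conclusion_index])


prior_strength = belief_in_conclusion(priors, SHEEP)  # before the premise
posterior = update_on_premise(priors, COWS)           # "Cows can get disease X."
strength = belief_in_conclusion(posterior, SHEEP)     # "Sheep can get disease X."

print("prior belief in conclusion:", prior_strength)
print("posterior belief in conclusion:", strength)
```

With these assumed priors, learning the premise raises belief in the conclusion because most of the prior mass on states in which cows have the property also has sheep sharing it.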

On the basis of assumptions about what 
people’s priors are, Heit (1998) described 
a number of the phenomena of categori- 
cal induction: similarity, typicality, diversity, 
and homogeneity. However, the model is 
inconsistent with nonmonotonicity effects. 
Furthermore, because it relies on an exten- 
sional updating rule, Bayes’ rule, the model 
cannot explain phenomena that are nonex- 
tensional such as the inclusion fallacy or the 
inclusion-similarity phenomenon. 

Sanjana and Tenenbaum (2003) offered a 
Bayesian model of categorical inference with 
a more principled foundation. The model is 
applied only to the animal domain. They de- 
rive all their probabilities from a hypothesis 
space that consists of clusters of categories. 
The model’s prediction for each argument 
derives from the probability that the conclu- 
sion category has the property. This reflects 
the probability that the conclusion category 
is an element of likely hypotheses — namely, 
that the conclusion category is in the same 
cluster as the examples shown (i.e., as the 
premise categories) and that those hypoth- 
esized clusters have high probability. The 
probability of each hypothesis is assumed to
be inversely related to the size of the hypothesis
(the number of animal types it includes)
and to its complexity, the number of
disjoint clusters that it includes. This model 
performed well in quantitative compar- 
isons against the similarity-coverage model 
and the feature-based model, although its 
consistency with the various phenomena 
of induction has not been reported and is 
rather opaque. 
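
A toy Python sketch may make the structure of such a hypothesis-space model more concrete. It is written in the spirit of the description above and is not Sanjana and Tenenbaum's implementation. Several things are assumptions made for illustration: the four clusters, the restriction to unions of at most two disjoint clusters, the prior that halves with each additional cluster, and the likelihood of 1/size per premise for hypotheses that contain the premises.

```python
from itertools import combinations

# Toy clusters of animal types (hypothetical; a fuller model would derive
# them from a cluster hierarchy over a much larger set of animals).
clusters = [
    frozenset({"horse", "cow"}),
    frozenset({"lion", "tiger"}),
    frozenset({"horse", "cow", "lion", "tiger"}),
    frozenset({"mouse"}),
]


def build_hypotheses(cluster_list, max_clusters=2):
    """Map each hypothesis (a union of disjoint clusters) to its complexity.

    Complexity is the smallest number of clusters that generates the
    hypothesis; smaller values of k are visited first, so setdefault keeps it.
    """
    hypotheses = {}
    for k in range(1, max_clusters + 1):
        for combo in combinations(cluster_list, k):
            h = frozenset().union(*combo)
            if len(h) != sum(len(c) for c in combo):
                continue  # clusters overlap, so they are not disjoint
            hypotheses.setdefault(h, k)
    return hypotheses


def predict(premises, conclusion, hypotheses, complexity_penalty=0.5):
    """Posterior probability that the conclusion category has the property."""
    scores = {}
    for h, k in hypotheses.items():
        if not set(premises) <= h:
            continue  # hypothesis is ruled out by at least one premise
        prior = complexity_penalty ** k               # fewer clusters, higher prior
        likelihood = (1.0 / len(h)) ** len(premises)  # smaller size, higher likelihood
        scores[h] = prior * likelihood
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no hypothesis is consistent with the premises")
    return sum(s for h, s in scores.items() if conclusion in h) / total


hypotheses = build_hypotheses(clusters)
for animal in ("cow", "lion", "mouse"):
    print(animal, round(predict(["horse"], animal, hypotheses), 3))
```

In this toy setting a premise about horses projects most strongly to cows, less to lions, and least to mice, because small, simple hypotheses that contain the premise category dominate the posterior.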

The principled probabilistic foundation 
of this model and its good fit to data so far 
yield promise that the model could serve 
as a formal representation of categorical in- 
duction. The model would show even more 
promise and power to generalize, however, if 
its predictions had been derived using more 
reasonable assumptions about the structure 
of categorical knowledge. The pairwise clus- 
ter hierarchy Sanjana and Tenenbaum use 
to represent knowledge of animals is poorly 
motivated (although see Kemp & Tenenbaum,
2003, for an improvement), and there
would be even less motivation in other do- 
mains (cf. Sloman, 1998). Moreover, if and 
how the model could explain fallacious rea- 
soning is not clear. 


SUMMARY OF INDUCTION AS SCIENTIFIC METHODOLOGY

Inductive inference can be fallacious, as 
demonstrated by the inclusion fallacy de- 
scribed previously. Nevertheless, much of 
the evidence that has been covered in this 
section suggests that people in the psychol- 
ogist’s laboratory are sensitive to some of the 
same concerns as scientists when they make 
inductive inferences. People are more likely 
to project nonvariable over variable predi- 
cates, they change their beliefs more when 
premises are a priori less likely, and their be- 
havior can be modeled by probabilistic mod- 
els constructed from rational principles. 
Other work reviewed shows that peo- 
ple, like scientists, use explanations to medi- 
ate their inference. They try to understand 
why a category should exhibit a predicate 
based on nonobservable properties. These 
are valuable observations to allow psychol- 
ogists to begin the process of building a 
descriptive theory of inductive inference. 


too few constraints on the cognitive pro- 
cesses and procedures that people actu- 
ally use. 


Conclusions and Future Directions 


We have reviewed two ways that cognitive 
scientists have tried to describe how peo- 
ple make inductive inferences. We limited 
the scope of the problem to that of cat- 
egorical induction — how people generate 
degrees of confidence that a predicate ap- 
plies to a stated category from premises con- 
cerning other categories that the predicate is 
assumed to apply to. Nevertheless, neither 
approach is a silver bullet. The similarity- 
based approach has produced the most well- 
specified models and phenomena, although 
consideration of the relation between scien- 
tific methodology and human induction may 
prove the most important prescriptively and 
may in the end provide the most enduring 
principles to distinguish everyday human in- 
duction from ideal — or at least other — in- 
ductive processes. 

A more liberal way to proceed is to ac- 
cept the apparent plurality of procedures 
and mechanisms that people use to make in- 
ductions and to see this pluralism as a virtue 
rather than a vice. 


The Bag of Tricks 


Many computational problems are hard be- 
cause the search space of possible answers 
is so large. Computer scientists have long 
used educated guesses or what are often 
called heuristics or rules of thumb to prune 
the search space, making it smaller and 
thus more tractable at the risk of making 
the problem insoluble by pruning off the 
best answers. The work of Kahneman and 
Tversky imported this notion of heuristics 
into the study of probability judgment (see 
Kahneman & Frederick, Chap. 12). They 
suggested that people use a set of cognitive 
heuristics to estimate probabilities — heuris- 
tics that were informed, that made people’s 
estimates likely to be reasonable, but left
room for error in
cases in which the heuristics that came naturally
to people had the unfortunate consequence
of leading to the wrong answer.

Kahneman and Tversky suggested the 
heuristics of availability, anchoring and ad- 
justment, simulation, and causality to de- 
scribe how people make probability judg- 
ments. They also suggested that people make 
judgments according to representativeness, 
the degree to which a class or event used 
as evidence is similar to the class or process 
being judged. Representativeness is a very 
abstract heuristic that is compatible with a 
number of different models of the judgment 
process. We understand it not so much as 
a particular claim about how people make 
probability judgments as the claim that pro- 
cesses of categorization and similarity play 
central roles in induction. This is precisely 
the claim of the similarity-based model out- 
lined previously. 

We believe that the bag of tricks describes 
most completely how people go about mak- 
ing inductive leaps. People seem to use a 
number of different sources of information 
for making inductive inferences, including 
the availability of featural information and 
knowledge about feature overlap, linguistic 
cues about the distribution of features, the 
relative centrality of features to one another, 
the relative probability of premises, and ob- 
jects’ roles in causal systems. 


Causal Induction 


Our guess is that the treasure trove for fu- 
ture work in categorical induction is in the 
development of the latter mode of infer- 
ence. How do people go about using causal 
knowledge to make inductions? That they 
do is indisputable. Consider the following 
phenomenon due to Heit and Rubinstein (1994):


Relevance 

People’s willingness to project a predi- 
cate from one category to another de- 
pends on what else the two categories 
have in common. For example, people 
are more likely to project “has a liver with
two chambers” from chickens to hawks
than from tigers to hawks, but more likely
to project “prefers to feed at night” from
tigers to hawks than from chickens to hawks.


More specifically, argument strength de- 
pends on how people explain why the cat- 
egory has the predicate. In the example, 
chickens and hawks are known to have bi- 
ological properties in common, and there- 
fore, people think it likely that a biologi- 
cal predicate would project from one to the 
other. Tigers and hawks are known to both 
be hunters and carnivores; therefore “prefers 
to feed at night” is more likely to project be- 
tween them. Sloman (1994) showed that the 
strength of an argument depends on whether 
the premise and conclusion are explained in 
the same way. If the premise and conclusion 
have different explanations, the premise can 
actually reduce belief in the conclusion. 
The explanations in these cases are causal; 
they refer to more or less well-understood 
causal processes. Medin, Coley, Storms, and 
Hayes (2003) have demonstrated five dis- 
tinct phenomena that depend on causal intu- 
itions about the relations amongst categories 
and predicates. For example, they showed 


Causal asymmetry 

Switching premise and conclusion cate- 
gories will reduce the strength of an argu- 
ment if a causal path exists from premise 
to conclusion. For example, 


Gazelles contain retinum. 
Lions contain retinum. 


is stronger than 


Lions contain retinum. 
Gazelles contain retinum. 


because the food chain is such that lions 
eat gazelles and retinum could be trans- 
ferred in the process. 


What is striking about this kind of example 
is the exquisite sensitivity to subtle (if mun- 
dane) causal relations that it demonstrates. 
The necessary causal explanation springs to 
mind quickly, apparently automatically, and 
it does so even though it depends on one 
fact that most people are only dimly aware
of among the vast
number of facts that are at our disposal.

We do not interpret the importance of 
causal relations in induction as support for 
psychological essentialism, the view that 
people base judgments concerning categories
on attributions of “essential” qualities:
of a true underlying nature that confers
kind identity, unlike, for example, Kornblith
(1993), Medin and Ortony (1989), and 
Gelman and Hirschfeld (1999). We rather 
follow Strevens (2001) in the claim that it 
is causal structure per se that mediates in- 
duction; no appeal to essential properties 
is required (cf. Rips, 2001; Sloman & Malt, 
2003). Indeed, the causal relations that sup- 
port inductive inference can be based on 
very superficial features that might be very 
mutable. To illustrate, the argument 


Giraffes eat leaves of type X. 
African tawny eagles eat leaves of type X. 


seems reasonably strong only because both 
giraffes and African eagles can reach high 
leaves and both are found in Africa — hardly 
a central property of either species. 

The appeal to causal structure is instead 
intended to appeal to the ability to pick 
out invariants and act as agents to make 
use of those invariants. Organisms have a 
striking ability to find the properties of 
things that maximize their ability to predict 
and control, and humans seem to have the 
most widely applicable capacity of this sort. 
However, prediction and control come from 
knowing what variables determine the val- 
ues of other variables — that is, how one pre- 
dicts future outcomes and knows what to 
manipulate to achieve an effect. This is, of 
course, the domain of causality. It seems only 
natural that people would use this talent to 
reason when making inductive inferences. 

The appeal to causal relations is not nec- 
essarily an appeal to scientific methodology. 
In fact, some philosophers such as Russell 
(1913) argued that theories are not scientific 
until they are devoid of causal reference, and 
the logical empiricists attempted to exorcise 
the notion of causality from “scientific” phi- 
losophy. Of course, to the extent that sci- 
entists behave like other people in their ap- 


methodology is trivial. 

Normative models of causal structure 
have recently flowered (cf. Pearl, 2000;
Spirtes, Glymour, & Scheines, 1993), and 
some of the insights of these models seem 
to have some psychological validity (Sloman 
& Lagnado, 2004). Bringing them to bear 
on the problem of inductive inference will 
not be trivial. However, the effort should be 
made because causal modeling seems to be 
a critical element of the bag of tricks that 
people use to make inductive inferences.


CHAPTER 6


Analogy 


Keith J. Holyoak 


Analogy is a special kind of similarity (see 
Goldstone & Son, Chap. 2). Two situations 
are analogous if they share a common pat- 
tern of relationships among their constituent 
elements even though the elements them- 
selves differ across the two situations. Typi- 
cally, one analog, termed the source or base, is 
more familiar or better understood than the 
second analog, termed the target. This asym- 
metry in initial knowledge provides the ba- 
sis for analogical transfer, using the source to 
generate inferences about the target. For ex- 
ample, Charles Darwin drew an analogy be- 
tween breeding programs used in agriculture 
to select more desirable plants and animals 
and “natural selection” for new species. The 
well-understood source analog called atten- 
tion to the importance of variability in the 
population as the basis for change in the dis- 
tribution of traits over successive generations 
and raised a critical question about the tar- 
get analog: What plays the role of the farmer 
in natural selection? (Another analogy, between
Malthus’ theory of human population
growth and the competition of individuals in 
a species to survive and reproduce, provided 
Darwin’s answer to this question.) Analo- 


gies have figured prominently in the history 
of science (see Dunbar & Fugelsang, Chap. 
29) and mathematics (Pask, 2003) and are of 
general use in problem solving (see Novick & 
Bassok, Chap. 14). In legal reasoning, the use 
of relevant past cases (legal precedents) to 
help decide a new case is a formalized appli- 
cation of analogical reasoning (see Ellsworth, 
Chap. 28). Analogies can also function to in- 
fluence political beliefs (Blanchette & Dun- 
bar, 2001) and to sway emotions (Thagard 
& Shelley, 2001). Analogical reasoning goes 
beyond the information initially given, using 
systematic connections between the source 
and target to generate plausible, although 
fallible, inferences about the target. Analogy 
is thus a form of inductive reasoning (see 
Sloman & Lagnado, Chap. 5). 

Figure 6.1 sketches the major compo- 
nent processes in analogical transfer (see 
Carbonell, 1983; Gentner, 1983; Gick & 
Holyoak, 1980, 1983; Novick & Holyoak, 
1991). Typically, a target situation serves 
as a retrieval cue for a potentially useful 
source analog. It is then necessary to es- 
tablish a mapping, or a set of systematic 
correspondences that serve to align the 
Figure 6.1. Major components of analogical 
reasoning. 


elements of the source and target. On the 
basis of the mapping, it is possible to de- 
rive new inferences about the target, thereby 
elaborating its representation. In the after- 
math of analogical reasoning about a pair of 
cases, it is possible that some form of rela- 
tional generalization may take place, yielding 
a more abstract schema for a class of situa- 
tions, of which the source and target are both 
instances. For example, Darwin’s use of anal- 
ogy to construct a theory of natural selection 
ultimately led to the generation of a more ab- 
stract schema for a selection theory, which 
in turn helped to generate new specific the- 
ories in many fields, including economics, 
genetics, sociobiology, and artificial intelli- 
gence. Analogy is one mechanism for effect- 
ing conceptual change (see Chi & Ohlsson, 
Chap. 16). 


A Capsule History 


The history of the study of analogy in- 
cludes three interwoven streams of research, 
which respectively emphasize analogy in re- 
lation to psychometric measurement of in- 
telligence, metaphor, and the representation 
of knowledge. 


Psychometric Tradition 


Work in the psychometric tradition focuses 
on four-term or “proportional” analogies in 
the form A:B::C:D, such as HAND: FIN- 
GER :: FOOT: ?, where the problem is to 
infer the missing D term (TOE) that is re- 
lated to C in the same way B is related to 
A (see Sternberg, Chap. 31). Thus A:B plays 
the role of source analog and C:D plays the 
role of target. Proportional analogies were 
discussed by Aristotle (see Hesse, 1966) and 
in the early decades of modern psychology 
became a centerpiece of efforts to define 
and measure intelligence. Charles Spearman 
(1923, 1927) argued that the best account 
of observed individual differences in cogni- 
tive performance was based on a general or 
g factor, with the remaining variance being 
unique to the particular task. He reviewed 
several studies that revealed high correla- 
tions between performance in solving anal- 
ogy problems and the g factor. Spearman’s 
student John C. Raven (1938) developed the 
Raven’s Progressive Matrices Test (RPM), 
which requires selection of a geometric fig- 
ure to fill an empty cell in a two-dimensional 
matrix (typically 3 x 3) of such figures. Sim- 
ilar to a geometric proportional analogy, the 
RPM requires participants to extract and ap- 
ply information based on visuospatial rela- 
tions. (See Hunt, 1974, and Carpenter, Just, 
& Shell, 1990, for analyses of strategies for 
solving RPM problems.) The RPM proved to 
be an especially pure measure of g. 

Raymond Cattell (1971), another student 
of Spearman, elaborated his mentor’s the- 
ory by distinguishing between two compo- 
nents of g: crystallized intelligence, which de- 
pends on previously learned information or 
skills, and fluid intelligence, which involves 
reasoning with novel information. As a form 
of inductive reasoning, analogy would be 
expected to require fluid intelligence. Cat- 
tell confirmed Spearman’s (1946) observa- 
tion that analogy tests and the RPM pro- 
vide sensitive measures of g, clarifying that 
Figure 6.2. Multidimensional scaling solution based on intercorrelations among the Raven’s 
Progressive Matrices test, analogy tests, and other common tests of cognitive function. (From Snow, 
Kyllonen, & Marshalek, 1984, p. 92. Reprinted by permission.) 


they primarily measure fluid intelligence 
(although verbal analogies based on diffi- 
cult vocabulary items also depend on crys- 
tallized intelligence). Figure 6.2 graphically 
depicts the centrality of RPM performance 
in a space defined by individual differences 
in performance on various cognitive tasks. 
Note that numeric, verbal, and geometric 
analogies cluster around the RPM at the cen- 
ter of the figure. 

Because four-term analogies and the RPM 
are based on small numbers of relatively 
well-specified elements and relations, it is 
possible to manipulate the complexity of 
such problems systematically and analyze 
performance (based on response latencies 
and error rates) in terms of component 


processes (e.g., Mulholland, Pellegrino, & 
Glaser, 1980; Sternberg, 1977). The earli- 
est computational models of analogy were 
developed for four-term analogy problems 
(Evans, 1968; Reitman, 1965). The basic 
components of these models were elabora- 
tions of those proposed by Spearman (1923), 
including encoding of the terms, accessing 
a relation between the A and B terms, and 
evoking a comparable relation between the 
C and D terms. 
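
The component steps just listed (encoding the terms, accessing the A:B relation, and evoking a comparable C:D relation) can be illustrated with a minimal sketch. It is not a reconstruction of any of the cited models; the tiny RELATIONS table, its relation labels, and the function name solve_analogy are hypothetical and chosen only for illustration.

```python
# Hypothetical toy knowledge base: (first term, second term) -> relation label.
RELATIONS = {
    ("HAND", "FINGER"): "has_digit",
    ("FOOT", "TOE"): "has_digit",
    ("HAND", "GLOVE"): "worn_on",
    ("FOOT", "SHOE"): "worn_on",
}


def solve_analogy(a, b, c, candidates):
    """Return every candidate D whose relation to C matches the A:B relation."""
    # Access the relation that links the A and B terms.
    source_relation = RELATIONS.get((a, b))
    if source_relation is None:
        return []
    # Keep each candidate D that stands in the same relation to C.
    return [d for d in candidates if RELATIONS.get((c, d)) == source_relation]


answers = solve_analogy("HAND", "FINGER", "FOOT", ["SHOE", "TOE", "GLOVE"])
```

Under these assumptions the only surviving candidate for HAND:FINGER :: FOOT:? is TOE, because it is the only option related to FOOT in the way FINGER is related to HAND.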

More recently, four-term analogy prob- 
lems and the RPM have figured promi- 
nently in neuropsychological and neu- 
roimaging studies of reasoning (e.g., Bunge, 
Wendelken, Badre & Wagner, 2004; Kroger 
et al., 2002; Luo et al., 2003; Prabhakaran 
et al., 2000). Analogical reasoning depends 
on working memory (see Morrison, Chap. 
19). The neural basis of working memory in- 
cludes the dorsolateral prefrontal cortex, an 
area of the brain that becomes increasingly 
activated as the complexity of the problem 
(measured in terms of number of relations 
relevant to the solution) increases. It has 
been argued that this area underlies the fluid 
component of Spearman’s g factor in intelli- 
gence (Duncan et al., 2000), and it plays an 
important role in many reasoning tasks (see 
Goel, Chap. 20). 


Metaphor 


Analogy is closely related to metaphor and 
related forms of symbolic expression that 
arise in everyday language (e.g., “the evening 
of life,” “the idea blossomed”), in literature 
(Holyoak, 1982), the arts, and cultural prac- 
tices such as ceremonies (see Holyoak & 
Thagard, 1995, Chap. 9). Similar to anal- 
ogy in general, metaphors are characterized 
by an asymmetry between target (conven- 
tionally termed “tenor”) and source (“ve- 
hicle”) domains (e.g., the target/tenor in 
“the evening of life” is life, which is un- 
derstood in terms of the source/vehicle of 
time of day). In addition, a mapping (the 
“grounds” for the metaphor) connects the 
source and target, allowing the domains to 
interact to generate a new conceptualiza- 
tion (Black, 1962). Metaphors are a special 
kind of analogy in that the source and tar- 
get domains are always semantically distant 
(Gentner, 1982; Gentner, Falkenhainer, & 
Skorstad, 1988), and the two domains are 
often blended rather than simply mapped 
(e.g., in “the idea blossomed,” the target 
is directly described in terms of an action 
term derived from the source). In addition, 
metaphors are often combined with other 
symbolic “figures” — especially metonymy 
(substitution of an associated concept). 
For example, “sword” is a metonymic ex- 
pression for weaponry, derived from its 
ancient association as the prototypical 
weapon — “Raising interest rates is the Fed- 
eral Reserve Board’s sword in the battle 


into metaphor. 

Fauconnier and Turner (1998; Fauconnier, 
2001) analyzed complex conceptual blends 
that are akin to metaphor. A typical exam- 
ple is a description of the voyage of a mod- 
ern catamaran sailing from San Francisco to 
Boston that was attempting to beat the speed 
record set by a clipper ship that had sailed 
the same route over a century earlier. A 
magazine account written during the cata- 
maran’s voyage said the modern boat was 
“barely maintaining a 4.5 day lead over the 
ghost of the clipper Northern Light. ...” Fau- 
connier and Turner observed that the maga- 
zine writer was describing a “boat race” that 
never took place in any direct sense; rather, 
the writer was blending the separate voy- 
ages of the two ships into an imaginary race. 
That such conceptual blends are so natural 
and easy to understand attests to how 
readily people can comprehend novel 
metaphors. 

Lakoff and Johnson (1980; also Lakoff & 
Turner, 1989) argued that much of human 
experience, especially its abstract aspects, 
is grasped in terms of broad conceptual 
metaphors (e.g., events occurring in time 
are understood by analogy to objects mov- 
ing in space). Time, for example, is under- 
stood in terms of objects in motion through 
space as in expressions such as “My birth- 
day is fast approaching” and “The time for 
action has arrived.” (See Boroditsky, 2000, 
for evidence of how temporal metaphors in- 
fluence cognitive judgments.) As Lakoff and 
Turner (1989) pointed out, the course of a 
life is understood in terms of time in the solar 
year (youth is springtime; old age is winter). 
Life is also conventionally conceptualized as 
a journey. Such conventional metaphors can 
still be used in creative ways, as illustrated 
by Robert Frost’s famous poem, “The Road 
Not Taken”: 


Two roads diverged in a wood, and I — 
I took the one less traveled by, 
And that has made all the difference. 


According to Lakoff and Turner, compre- 
hension of this passage depends on our im- 
plicit knowledge of the metaphor that life 
is a journey. This knowledge includes un- 
derstanding several interrelated correspon- 
dences (e.g., person is a traveler, purposes 
are destinations, actions are routes, diffi- 
culties in life are impediments to travel, 
counselors are guides, and progress is the 
distance traveled). 

Psychological research has focused on 
demonstrations that metaphors are in- 
tegral to everyday language understand- 
ing (Glucksberg, Gildea, & Bookin, 1982; 
Keysar, 1989) and debate about whether 
metaphor is better conceptualized as a kind 
of analogy (Wolff & Gentner, 2000) or a 
kind of categorization (Glucksberg & Keysar, 
1990; Glucksberg, McGlone, & Manfredi, 
1997). A likely resolution is that novel 
metaphors are interpreted by much the same 
process as analogies, whereas more conven- 
tional metaphors are interpreted as more 
general schemas (Gentner, Bowdle, Wolff, & 
Boronat, 2001). 


Knowledge Representation 


The most important influence on analogy 
research in the cognitive science tradition 
has been work on the representation of 
knowledge within computational systems. 
Many seminal ideas were developed 
by the philosopher Mary Hesse (1966), who 
was in turn influenced by Aristotle’s dis- 
cussions of analogy in scientific classifica- 
tion and Black’s (1962) interactionist view 
of metaphor. Hesse placed great stress on 
the purpose of analogy as a tool for scien- 
tific discovery and conceptual change and on 
the close connections between causal rela- 
tions and analogical mapping. In the 1970s, 
work in artificial intelligence and psychol- 
ogy focused on the representation of com- 
plex knowledge of the sort used in scientific 
reasoning, problem solving, story compre- 
hension, and other tasks that require struc- 
tured knowledge. A key aspect of structured 
knowledge is that elements can be flexibly 
bound into the roles of relations. For exam- 
ple, “dog bit man” and “man bit dog” have the 
same elements and the same relation, but 
the role bindings have been reversed, radi- 
cally altering the meaning. How the mind 
accomplishes such role binding is thus a 
central problem to be solved by any psycho- 
logical theory of structured knowledge, in- 
cluding any theory of analogy (see Doumas 
& Hummel, Chap. 4). 
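
The role-binding point can be made concrete with a small sketch. The Proposition type and its agent and patient role names are hypothetical conveniences, not a claim about how any particular theory encodes propositions; the sketch shows only that two propositions can share a relation and a set of elements while differing in their bindings.

```python
from typing import NamedTuple


class Proposition(NamedTuple):
    """A relation together with the elements bound to each of its roles."""
    relation: str
    agent: str    # hypothetical role name: the one who acts
    patient: str  # hypothetical role name: the one acted upon


dog_bit_man = Proposition("bite", agent="dog", patient="man")
man_bit_dog = Proposition("bite", agent="man", patient="dog")


def elements(proposition):
    """Return the unordered set of elements, ignoring which role each fills."""
    return {proposition.agent, proposition.patient}


same_relation = dog_bit_man.relation == man_bit_dog.relation
same_elements = elements(dog_bit_man) == elements(man_bit_dog)
same_meaning = dog_bit_man == man_bit_dog  # compares the role bindings too
```

Here same_relation and same_elements are both true, yet same_meaning is false: the two propositions differ only in which element is bound to which role.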

In the 1980s, a number of cognitive sci- 
entists recognized the centrality of analogy 
as a tool for discovery and its close connec- 
tion with theories of knowledge represen- 
tation. Winston (1980), guided by Minsky’s 
(1975) treatment of knowledge representa- 
tion, built a computer model of analogy that 
highlighted the importance of causal rela- 
tions in guiding analogical inference. Other 
researchers in artificial intelligence also be- 
gan to consider the use of complex analogies 
in reasoning and learning (Kolodner, 1983; 
Schank, 1982), leading to an approach to ar- 
tificial intelligence termed case-based reason- 
ing (see Kolodner, 1993). 

Around 1980, two research projects in 
psychology began to consider analogy in 
relation to knowledge representation and 
eventually integrate computational model- 
ing with detailed experimental studies of 
human analogical reasoning. Gentner (1982, 
1983; Gentner & Gentner, 1983) began 
working on mental models and analogy in 
science. She emphasized that in analogy, 
the key similarities lie in the relations that 
hold within the domains (e.g., the flow of 
electrons in an electrical circuit is analog- 
ically similar to the flow of people in a 
crowded subway tunnel), rather than in fea- 
tures of individual objects (e.g., electrons 
do not resemble people). Moreover, analog- 
ical similarities often depend on higher-order 
relations — relations between relations. For ex- 
ample, adding a resistor to a circuit causes a 
decrease in flow of electricity, just as adding a 
narrow gate in the subway tunnel would de- 
crease the rate at which people pass through 
(where causes is a higher-order relation). In 
her structure-mapping theory, Gentner pro- 
posed that analogy entails finding a struc- 
tural alignment, or mapping, between do- 
mains. In this theory, alignment between two 
representational structures is characterized 
by structural parallelism (consistent, one- 
to-one correspondences between mapped 
elements) and systematicity — an implicit 
preference for deep, interconnected systems 
of relations governed by higher-order rela- 
tions, such as causal, mathematical, or func- 
tional relations. 
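
Structural parallelism can be sketched as a search for a consistent, one-to-one assignment of source objects to target objects that preserves as many relations as possible. The sketch below is a brute-force illustration under stated assumptions, not Gentner's structure-mapping engine: the fact tuples, the predicate names flows_through and narrows, and the function best_mapping are all hypothetical, and systematicity (the preference for higher-order relations) is not modeled.

```python
from itertools import permutations

# Hypothetical first-order facts, each written as (predicate, arg1, arg2).
source = {("flows_through", "electrons", "wire"),
          ("narrows", "resistor", "wire")}
target = {("flows_through", "people", "tunnel"),
          ("narrows", "gate", "tunnel")}


def objects(facts):
    """Collect the distinct objects mentioned in a set of facts."""
    return sorted({arg for _, *args in facts for arg in args})


def best_mapping(source, target):
    """Try every one-to-one object mapping and keep the one that carries the
    most source facts onto target facts. Assumes the target has at least as
    many objects as the source."""
    src_objs, tgt_objs = objects(source), objects(target)
    best, best_score = {}, -1
    for perm in permutations(tgt_objs, len(src_objs)):
        mapping = dict(zip(src_objs, perm))  # one-to-one by construction
        score = sum((pred, mapping[x], mapping[y]) in target
                    for pred, x, y in source)
        if score > best_score:
            best, best_score = mapping, score
    return best, best_score


mapping, score = best_mapping(source, target)
```

The search pairs electrons with people, the resistor with the gate, and the wire with the tunnel purely because of shared relational structure; no feature of the individual objects is consulted.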

Holyoak (1985; Gick & Holyoak, 1980, 
1983; Holyoak & Koh, 1987) focused on the 
role of analogy in problem solving with a 
strong concern for the role of pragmatics in 
analogy — that is, how causal relations that 
impact current goals and context guide the 
interpretation of an analogy. Holyoak and 
Thagard (1989a, 1995) developed an ap- 
proach to analogy in which several factors 
were viewed as jointly constraining analogi- 
cal reasoning. According to their multicon- 
straint theory, people tend to find mappings 
that maximize similarity of corresponding el- 
ements and relations, structural parallelism 
(i.e., isomorphism, defined by consistent, 
one-to-one correspondences), and prag- 
matic factors such as the importance of el- 
ements and relations for achieving a goal. 
Gick and Holyoak (1983) provided evidence 
that analogy can furnish the seed for form- 
ing new relational categories by abstracting 
the relational correspondences between ex- 
amples into a schema for a class of problems. 
Analogy was viewed as a central part of hu- 
man induction (Holland, Holyoak, Nisbett, 
& Thagard, 1986; see Sloman & Lagnado, 
Chap. 5) with close ties to other basic 
thinking processes, including causal infer- 
ence (see Buehner & Cheng, Chap. 7), cate- 
gorization (see Medin & Rips, Chap. 3), de- 
ductive reasoning (see Evans, Chap. 8), 
and problem solving (see Novick & Bassok, 
Chap. 14). 


Analogical Reasoning: Overview 
of Phenomena 


This section provides an overview of the 
major phenomena involving analogical rea- 
soning that have been established by em- 
pirical investigations. This review is orga- 
nized around the major components of 
analogy depicted in Figure 6.1. These com- 
ponents are inherently interrelated, so the 
connections among them are also discussed. 


The retrieval and mapping components are 
first considered, followed by inference and 
relational generalization. 


Retrieval and Mapping 


A PARADIGM FOR INVESTIGATING 
ANALOGICAL TRANSFER 


Gick and Holyoak (1980, 1983) introduced 
a general laboratory paradigm for investigat- 
ing analogical transfer in the context of prob- 
lem solving. The general approach was first 
to provide people with a source analog in 
the guise of some incidental context, such 
as an experiment on “story memory.” Later, 
participants were asked to solve a problem 
that was in fact analogous to the story they 
had studied earlier. The questions of cen- 
tral interest were (1) whether people would 
spontaneously notice the relevance of the 
source analog and use it to solve the target 
problem, and (2) whether they could solve 
the analogy once they were cued to consider 
the source. Spontaneous transfer of the anal- 
ogous solution implies successful retrieval 
and mapping; cued transfer implies success- 
ful mapping once the need to retrieve the 
source has been removed. 

The source analog used by Gick and 
Holyoak (1980) was a story about a general 
who is trying to capture a fortress controlled 
by a dictator and needs to get his army to 
the fortress at full strength. Because the en- 
tire army could not pass safely along any sin- 
gle road, the general sends his men in small 
groups down several roads simultaneously. 
Arriving at the same time, the groups join 
together and capture the fortress. 

A few minutes after reading this story 
under instructions to read and remember it 
(along with two other irrelevant stories), par- 
ticipants were asked to solve a tumor prob- 
lem (Duncker, 1945), in which a doctor has 
to figure out how to use rays to destroy a 
stomach tumor without injuring the patient 
in the process. The crux of the problem is 
that it seems that the rays will have the same 
effect on the healthy tissue as on the tumor — 
high intensity will destroy both, whereas low 
intensity will destroy neither. The key issue 
is to determine how the rays can be made to 
destroy the tumor without also destroying 
the surrounding tissue. The source analog, 
if it can be retrieved and mapped, can be 
used to generate a “convergence” solution to 
the tumor problem, one that parallels the 
general’s military strategy: Instead of using 
a single high-intensity ray, the doctor could 
administer several low-intensity rays at once 
from different directions. In that way, each 
ray would be at low intensity along its path, 
and hence, harmless to the healthy tissue, 
but the effects of the rays would sum to 
achieve the effect of a high-intensity ray at 
their focal point, the site of the tumor. 
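
The arithmetic behind the convergence solution can be sketched as follows. The intensities are in arbitrary, made-up units, and the sketch assumes that the effects of the rays simply add at the focal point; the function name convergence_plan is hypothetical.

```python
import math


def convergence_plan(required_intensity, safe_intensity):
    """Split one strong ray into the fewest equal weak rays that are each safe.

    Assumes intensities add at the focal point (an illustrative simplification).
    """
    # Fewest rays n such that required_intensity / n <= safe_intensity.
    n_rays = math.ceil(required_intensity / safe_intensity)
    per_ray = required_intensity / n_rays
    return n_rays, per_ray


# Hypothetical values: the tumor needs 10 units; healthy tissue tolerates 3.
n_rays, per_ray = convergence_plan(required_intensity=10.0, safe_intensity=3.0)
```

With these illustrative numbers, four rays of 2.5 units each stay below the tolerance of healthy tissue along every path while summing to the required 10 units at the tumor.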

When Gick and Holyoak (1980) asked 
college students to solve the tumor problem, 
without a source analog, only about 10% of 
them produced the convergence solution. 
When the general story had been studied, 
but no hint to use it was given, only about 
20% of participants produced the conver- 
gence solution. In contrast, when the same 
participants were then given a simple hint 
that “you may find one of the stories you read 
earlier to be helpful in solving the problem,” 
about 75% succeeded in generating the anal- 
ogous convergence solution. In other words, 
people often fail to notice superficially 
dissimilar source analogs that they could 
readily use. 

This gap between the difficulty of re- 
trieving remote analogs and the relative 
ease of mapping them has been replicated 
many times, both with adults (Gentner, 
Rattermann, & Forbus, 1993; Holyoak & 
Koh, 1987; Spencer & Weisberg, 1986) 
and with young children (Chen, 1996; 
Holyoak, Junn, & Billman, 1984; Tunteler 
& Resing, 2002). When analogs must 
be cued from long-term memory, cases 
from a domain similar to that of the 
cue are retrieved much more readily than 
cases from remote domains (Keane, 1987; 
Seifert, McKoon, Abelson, & Ratcliff, 1986). 
For example, Keane (1987) measured re- 
trieval of a convergence analog to the tu- 
mor problem when the source analog was 
studied 1 to 3 days prior to presentation of 
the target radiation problem. Keane found 
that 88% of participants retrieved a source 
analog from the same domain (a story about 


only 12% retrieved a source from a remote 
domain (the general story). This difference 
in ease of access was dissociable from the 
ease of postaccess mapping and transfer be- 
cause the frequency of generating the con- 
vergence solution to the radiation prob- 
lem once the source analog was cued was 
high and equal (about 86%), regardless of 
whether the source analog was from the 
same or a different domain. 


DIFFERENTIAL IMPACT OF SIMILARITY AND 
STRUCTURE ON RETRIEVAL VERSUS MAPPING 


The main empirical generalization concern- 
ing retrieval and mapping is that similar- 
ity of individual concepts in the analogs 
has a relatively greater impact on retrieval, 
whereas mapping is relatively more sensi- 
tive to relational correspondences (Gentner 
et al., 1993; Holyoak & Koh, 1987; Ross, 
1987, 1989). However, this dissociation is 
not absolute. Watching the movie West Side 
Story for the first time is likely to trigger a re- 
minding of Shakespeare’s Romeo and Juliet 
despite the displacement of the characters 
in the two works over centuries and conti- 
nents. The two stories both involve young 
lovers who suffer because of the disapproval 
of their respective social groups, causing a 
false report of death, which in turn leads 
to tragedy. It is these structural parallels be- 
tween the two stories that make them anal- 
ogous rather than simply that both stories 
involve a young man and woman, a disap- 
proval, a false report, and a tragedy. 

Experimental work on story reminding 
confirms the importance of structure, as well 
as similarity of concepts, in retrieving analogs 
from memory. Wharton and his colleagues 
(Wharton et al., 1994; Wharton, Holyoak, 
& Lange, 1996) performed a series of exper- 
iments in which college students tried to find 
connections between stories that overlapped 
in various ways in terms of the actors and ac- 
tions and the underlying themes. In a typical 
experiment, the students first studied about 
a dozen “target” stories presented in the guise 
of a study of story understanding. For exam- 
ple, one target story exemplified a theme of- 
ten called “sour grapes” after one of Aesop’s 
fables, in which the pro- 
tagonist tries to achieve a goal, fails, and then 
retroactively decides the goal had not really 
been desirable after all. More specifically, the 
actions involved someone trying unsuccess- 
fully to get accepted to an Ivy League col- 
lege. After a delay, the students read a set 
of different cue stories and were asked to 
write down any story or stories from the 
first session of which they were reminded. 
Some stories (far analogs) exemplified the 
same theme, but with very different char- 
acters and actions (e.g., a “sour grapes” fairy 
tale about a unicorn who tries to cross a river 
but is forced to turn back). Other stories 
were far “disanalogs” formed by reorganizing 
the characters and actions to represent a dis- 
tinctly different theme (e.g., “self-doubt” — 
the failure to achieve a goal leads the pro- 
tagonist to doubt his or her own ability or 
merit). Thus, neither type of cue was simi- 
lar to the target story in terms of individual 
elements (characters and actions); however, 
the far analog maintained structural corre- 
spondences of higher-order causal relations 
with the target story, whereas the far disana- 
log did not. 

Besides varying the relation between the 
cue and target stories, Wharton et al. (1994) 
also varied the number of target stories that 
were in some way related to a single cue. 
When only one target story in a set had been 
studied (“singleton” condition), the proba- 
bility of reminding was about equal, regard- 
less of whether the cue was analogous to the 
target. However, when two target stories had 
been studied (e.g., both “sour grapes” and 
“self-doubt,” forming a “competition” condi- 
tion), the analogous target was more likely to 
be retrieved than the disanalogous one. The 
advantage of the far analog in the competi- 
tion condition was maintained even when a 
week intervened between initial study of the 
target stories and presentation of the cue sto- 
ries (Wharton et al., 1996). 

These results demonstrate that structure 
does influence analogical retrieval, but its 
impact is much more evident when multi- 
ple memory traces, each somewhat similar 
to the cue, must compete to be retrieved. 
Such retrieval competition is likely typical 
of everyday analogical reminding. Other ev- 
idence indicates that having people generate 
case examples, as opposed to simply asking 
them to remember cases presented earlier, 
enhances structure-based access to source 
analogs (Blanchette & Dunbar, 2000). 


THE “RELATIONAL SHIFT” IN DEVELOPMENT 


Retrieval is thus sensitive to structure as 
well as direct similarity of concepts. 
Conversely, mapping is sensitive to direct 
similarity as well as structure (e.g., Reed, 
1987; Ross, 1989). 
Young children are particularly sensitive 
to direct similarity of objects; when asked 
to identify corresponding elements in two 
analogs, their mappings are dominated by 
object similarity when semantic and struc- 
tural constraints conflict (Gentner & Toupin, 
1986). Younger children are particularly 
likely to map on the basis of object simi- 
larity when the relational response requires 
integration of multiple relations, and hence, 
is more dependent on working memory re- 
sources (Richland, Morrison, & Holyoak, 
2004). The developmental transition to- 
ward greater reliance on structure in map- 
ping has been termed the “relational shift” 
(Gentner & Rattermann, 1991). Greater sen- 
sitivity to relations with age appears to arise 
owing to a combination of incremental ac- 
cretion of knowledge about relational con- 
cepts and stage-like increments in working 
memory capacity (Halford, 1993; Halford 
& Wilson, 1980). (For reviews of develop- 
mental research on analogy, see Goswami, 
1992, 2001; Halford, Chap. 22; Holyoak & 
Thagard, 1995). 


GOAL-DIRECTED MAPPING 


Mapping is guided not only by relational 
structure and element similarity but also by 
the goals of the analogist (Holyoak, 1985). 
People draw analogies not to find a pris- 
tine isomorphism for its own sake but to 
make plausible inferences that will achieve 
their goals. Particularly when the mapping 
is inherently ambiguous, the constraint of 
pragmatic centrality — relevance to goals — 
is critical (Holyoak, 1985). Spellman and 
Holyoak (1996) investigated the impact of 
processing goals on the mappings gener- 
ated for inherently ambiguous analogies. In 
one experiment, college students read two 
science fiction stories about countries on 
two planets. These countries were interre- 
lated by various economic and military al- 
liances. Participants first made judgments 
about individual countries based on either 
economic or military relationships and were 
then asked mapping questions about which 
countries on one planet corresponded to 
which on the other. Schematically, planet 1 
included three countries, such that “Afflu” 
was economically richer than “Barebrute,” 
whereas the latter was militarily stronger 
than “Compak.” Planet 2 included four 
countries, with “Grainwell” being richer than 
“Hungerall” and “Millpower” being stronger 
than “Mightless.” The critical aspect of this 
analogy problem is that Barebrute (planet 1) 
is both economically weak (like Hunger- 
all on planet 2) and militarily strong (like 
Millpower) and therefore, has two compet- 
ing mappings that are equally supported by 
structural and similarity constraints. 

Spellman and Holyoak (1996) found that 
participants whose processing goal led them 
to focus on economic relationships tended 
to map Barebrute to Hungerall rather than 
Millpower, whereas those whose process- 
ing goal led them to focus on military 
relationships had the opposite preferred 
mapping. The variation in pragmatic cen- 
trality of the information thus served to 
decide between the competing mappings. 
One interpretation of such findings is that 
pragmatically central propositions tend to 
be considered earlier and more often than 
those that are less goal relevant and hence, 
dominate the mapping process (Hummel & 
Holyoak, 1997). 
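
One way to picture how pragmatic centrality can break a structural tie is to give goal-relevant relations extra weight. The sketch below is only an illustrative weighting scheme, not the account of Hummel and Holyoak (1997), which appeals to the order and frequency with which propositions are considered; the SUPPORT table, the weights, and the function name preferred_mappings are hypothetical.

```python
# Each candidate correspondence for Barebrute is supported by one kind of
# relation (hypothetical encoding of the two-planet materials).
SUPPORT = {
    "Hungerall": "economic",  # both are the poorer member of a richer-than pair
    "Millpower": "military",  # both are the stronger member of a stronger-than pair
}


def preferred_mappings(goal=None, goal_weight=2.0, base_weight=1.0):
    """Return the best-supported match(es) for Barebrute under a processing goal.

    Relations of the goal-relevant kind receive goal_weight; all others receive
    base_weight. With no goal, the candidates tie.
    """
    scores = {country: (goal_weight if kind == goal else base_weight)
              for country, kind in SUPPORT.items()}
    top = max(scores.values())
    return sorted(country for country, s in scores.items() if s == top)


economic_choice = preferred_mappings("economic")
military_choice = preferred_mappings("military")
no_goal_choice = preferred_mappings()
```

Under these assumed weights an economic goal selects Hungerall, a military goal selects Millpower, and the absence of a goal leaves both candidates tied, mirroring the ambiguity built into the materials.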


COHERENCE IN ANALOGICAL MAPPING 


The key idea of Holyoak and Thagard’s 
(1989a) multiconstraint theory of analogy is 
that several different kinds of constraints — 
similarity, structure, and purpose — all in- 
teract to determine the optimal set of cor- 
respondences between source and target. A 
good analogy is one that appears coherent in 
the sense that multiple constraints converge 
on a solution that satisfies as many differ- 
ent constraints as possible (Thagard, 2000). 
Everyday use of analogies depends on the 
human ability to find coherent mappings — 
even when source and target are complex 
and the mappings are ambiguous. For ex- 
ample, political debate often makes use of 
analogies between prior situations and some 
current controversy (Blanchette & Dunbar, 
2001, 2002). Ever since World War II, politi- 
cians in the United States and elsewhere 
have periodically argued that some military 
intervention was justified because the cur- 
rent situation was analogous to that lead- 
ing to World War II. A commonsensical 
mental representation of World War II, the 
source analog, amounts to a story figuring 
an evil villain, Hitler; misguided appeasers, 
such as Neville Chamberlain; and clear- 
sighted heroes, such as Winston Churchill 
and Franklin Delano Roosevelt. The coun- 
tries involved in World War II included the 
villains, Germany and Japan; the victims, 
such as Austria, Czechoslovakia, and Poland; 
and the heroic defenders, notably Britain and 
the United States. 

A series of American presidents have used 
the World War II analog as part of their 
argument for American military interven- 
tion abroad (see Khong, 1992). These in- 
clude Harry Truman (Korea, 1950), Lyndon 
Johnson (Vietnam, 1965), George Bush se- 
nior (Kuwait and Iraq, 1991), and his son 
George W. Bush (Iraq, 2003). Analogies to 
World War II have also been used to sup- 
port less aggressive responses. Most notably, 
during the Cuban missile crisis of 1962, 
President John F. Kennedy decided against 
a surprise attack on Cuba in part because he 
did not want the United States to behave in 
a way that could be equated to Japan’s sur- 
prise attack on Pearl Harbor. 

The World War II situation was, of course, 
very complex and is never likely to map per- 
fectly onto any new foreign policy problem. 
Nonetheless, by selectively focusing on goal- 
relevant aspects of the source and target and 
using multiple constraints in combination, 
people can often find coherent mappings in 
situations of this sort. After the Iraqi invasion 
of Kuwait in 1990, U.S. President George 
H. W. Bush argued that Saddam Hussein, the 
Iraqi leader, was analogous to Adolf Hitler 
and that the Persian Gulf crisis in general was 
analogous to events that had led to World 
War II a half-century earlier. By drawing the 
analogy between Hussein and Hitler, Pres- 
ident Bush encouraged a reasoning process 
that led to the construction of a coherent 
system of roles for the players in the Gulf sit- 
uation. The popular understanding of World 
War II provided the source, and analogical 
mapping imposed a set of roles on the tar- 
get Gulf situation by selectively emphasizing 
the most salient relational parallels between 
the two situations. Once the analogical cor- 
respondences were established (with Iraq 
identified as an expansionist dictatorship like 
Germany, Kuwait as its first victim, Saudi 
Arabia as the next potential victim, and the 
United States as the main defender of the 
Gulf states), the clear analogical inference 
was that both self-interest and moral con- 
siderations required immediate military in- 
tervention by the United States. Aspects of 
the Persian Gulf situation that did not map 
well to World War II (e.g., lack of democracy 
in Kuwait) were pushed to the background. 

Of course, the analogy between the two 
situations was by no means perfect. Simi- 
larity at the object level favored mapping 
the United States of 1991 to the United 
States of World War II simply because it 
was the same country, which would in turn 
support mapping Bush to President Roo- 
sevelt. However, the United States did not 
enter World War II until it was bombed 
by Japan, well after Hitler had marched 
through much of Europe. One might there- 
fore argue that the United States of 1991 
mapped to Great Britain of World War II and 
that Bush mapped to Winston Churchill, the 
British Prime Minister (because Bush, sim- 
ilar to Churchill, led his nation and West- 
ern allies in early opposition to aggression). 
These conflicting pressures made the map- 
pings ambiguous. However, the pressure to 
maintain structural consistency implies that 
people who mapped the United States to 
Britain should also tend to map Bush to 
Churchill, whereas those who mapped the
United States to the United States should
instead map Bush to Roosevelt.

During the first 2 days of the U.S.-led 
counterattack against the Iraqi invasion of 
Kuwait, Spellman and Holyoak (1992) asked 
a group of American undergraduates a few 
questions to find out how they interpreted 
the analogy between the then-current situ- 
ation in the Persian Gulf and World War II. 
The undergraduates were asked to sup- 
pose that Saddam Hussein was analogous 
to Hitler. Regardless of whether they be- 
lieved the analogy was appropriate, they 
were then asked to write down the most 
natural match in the World War II situation 
for Iraq, the United States, Kuwait, Saudi 
Arabia, and George Bush. For those stu- 
dents who gave evidence that they knew the 
basic facts about World War II, the major- 
ity produced mappings that fell into one of 
two patterns. Those students who mapped 
the United States to itself also mapped 
Bush to Roosevelt; these same students also 
tended to map Saudi Arabia to Great Britain. 
Other students, in contrast, mapped the 
United States to Great Britain and Bush to 
Churchill, which in turn (so as to maintain 
one-to-one correspondences) forced Saudi 
Arabia to map to some country other than 
Britain. The mapping for Kuwait (which did 
not depend on the choice of mappings for 
Bush, the United States, or Saudi Arabia) 
was usually to one or two of the early vic- 
tims of Germany in World War II (usually 
Austria or Poland). 

The analogy between the Persian Gulf sit- 
uation and World War II thus generated a 
“bistable” mapping: People tended to pro- 
vide mappings based on either of two coher- 
ent but mutually incompatible sets of corre- 
spondences. Spellman and Holyoak (1992)
went on to perform a second study, using a 
different group of undergraduates, to show 
that people’s preferred mappings could be 
pushed around by manipulating their knowl- 
edge of the source analog, World War II. 
Because many undergraduates were lacking 
in knowledge about the major participants 
and events in World War II, it proved pos- 
sible to “guide” them to one or the other 
mapping pattern by having them first read a
summary of World War II. The various summaries were all
historically “correct,” in the sense of providing
only information taken directly from history 
books, but each contained slightly differ- 
ent information and emphasized different 
points. Each summary began with an iden- 
tical passage about Hitler’s acquisition of 
Austria, Czechoslovakia, and Poland and the 
efforts by Britain and France to stop him. 
The versions then diverged. Some versions 
went on to emphasize the personal role of 
Churchill and the national role of Britain; 
other versions placed greater emphasis on 
what Roosevelt and the United States did 
to further the war effort. After reading one 
of these summaries of World War II, the un- 
dergraduates were asked the same mapping 
questions as had been used in the previous 
study. The same bistable mapping patterns 
emerged as before, but this time the sum- 
maries influenced which of the two coherent
patterns of responses students tended
to give. People who read a “Churchill” ver- 
sion tended to map Bush to Churchill and 
the United States to Great Britain, whereas 
those who read a “Roosevelt” version tended 
to map Bush to Roosevelt and the United 
States to the United States. It thus ap- 
pears that even when an analogy is messy 
and ambiguous, the constraints on analog- 
ical coherence produce predictable inter- 
pretations of how the source and target 
fit together. 

Achieving analogical coherence in map- 
ping does not, of course, guarantee that the 
source will provide a clear and compelling 
basis for planning a course of action to deal 
with the target situation. In 1991, President 
Bush considered Hussein enough of a Hitler 
to justify intervention in Kuwait but not 
enough of one to warrant his removal from 
power in Iraq. A decade later his son, Presi- 
dent George W. Bush, reinvoked the World 
War II analogy to justify a preemptive inva- 
sion of Iraq itself. Bush claimed (falsely, as 
was later revealed) that Hussein was acquir- 
ing biological and perhaps nuclear weapons 
that posed an imminent threat to the United 
States and its allies. Historical analogies can 
be used to obfuscate as well as to illuminate. 



Figure 6.3. An example of a pair of pictures 
used in studies of analogical mapping with 
arrows added to indicate featural and relational 
responses. (From Tohill & Holyoak, 2000, p. 31. 
Reprinted by permission.) 


WORKING MEMORY IN ANALOGICAL MAPPING 


Analogical reasoning, because it depends on 
manipulating structured representations of 
knowledge, would be expected to make crit- 
ical use of working memory. The role of 
working memory in analogy has been ex- 
plored using a picture-mapping paradigm in- 
troduced by Markman and Gentner (1993). 
An example of stimuli similar to those they 
used is shown in Figure 6.3. In their exper- 
iments, college students were asked to ex- 
amine the two pictures and then decide (for 
this hypothetical example) what object in 
the bottom picture best goes with the man 
in the top picture. When this single map- 
ping is considered in isolation, people often 
indicate that the boy in the bottom picture 
goes with the man in the top picture based 
on perceptual and semantic similarity of 
these elements. However, when people are 
asked to match not just one object but three 
(e.g., the man, dog, and the tree in the top
picture), they are led to build an integrated representation
of the relations among the objects and
of higher-order relations between relations. 
In the top picture, a man is unsuccessfully 
trying to restrain a dog, which then chases 
the cat. In the bottom picture, the tree is un- 
successful in restraining the dog, which then 
chases the boy. Based on these multiple in- 
teracting relations, the preferred match to 
the man in the top picture is not the boy in 
the lower scene but the tree. Consequently, 
people who map three objects at once are 
more likely to map the man to the tree on 
the basis of their similar relational roles than 
are people who map the man alone. 

Whereas Markman and Gentner (1993) 
showed that the number of objects to be 
mapped influences the balance between 
the impact of element similarity versus re- 
lational structure, other studies using the 
picture-mapping paradigm have demon- 
strated that manipulations that constrict 
working memory resources have a similar 
impact. Waltz, Lau, Grewal, and Holyoak 
(2000) asked college students to map pic- 
tures while performing a secondary task 
designed to tax working memory (e.g., gen- 
erating random digits). Adding a dual task di- 
minished relational responses and increased 
similarity-based responses (see Morrison, 
Chap. 19). A manipulation that increases 
people’s anxiety level (performing mathe- 
matical calculations under speed pressure 
prior to the mapping task) yielded a sim- 
ilar shift in mapping responses (Tohill & 
Holyoak, 2000). Most dramatically, degen- 
eration of the frontal lobes radically impairs 
relation-based mapping (Morrison et al., 
2004). In related work using complex story 
analogs, Krawczyk, Holyoak, and Hummel 
(2004) demonstrated that mappings (and in- 
ferences) based on element similarity ver- 
sus relational structure were made about 
equally often when the element similarities 
were salient and the relational structure was 
highly complex. All these findings support 
the hypothesis that mapping on the basis of 
relations requires adequate working mem- 
ory to represent and manipulate role bind- 
ings (Hummel & Holyoak, 1997). 


COPY WITH SUBSTITUTION AND GENERATION 


Analogical inference — using a source analog 
to form a new conjecture, whether it be a 
step toward solving a math problem (Reed, 
Dempster, & Ettinger, 1985; see Novick 
& Bassok, Chap. 14), a scientific hypoth- 
esis (see Dunbar & Fugelsang, Chap. 29), 
a diagnosis for puzzling medical symptoms 
(see Patel, Arocha, & Zhang, Chap. 30), 
or a basis for deciding a legal case (see 
Ellsworth, Chap. 28) — is the fundamental 
purpose of analogical reasoning. Mapping 
serves to highlight correspondences between 
the source and target, including “alignable 
differences” (Markman & Gentner, 1993) — 
the distinct but corresponding elements of 
the two analogs. These correspondences pro- 
vide the input to an inference engine that 
generates new target propositions. The ba- 
sic form of analogical inference has been 
called “copy with substitution and genera- 
tion” (CWSG; Holyoak et al., 1994). CWSG 
involves constructing target analogs of un- 
mapped source propositions by substituting 
the corresponding target element, if known, 
for each source element, and if no corre- 
sponding target element exists, postulating 
one as needed. This procedure gives rise to 
two important corollaries concerning infer- 
ence errors. First, if critical elements are dif- 
ficult to map (e.g., because of strong repre- 
sentational asymmetries such as those that 
hinder mapping a discrete set of elements 
to a continuous variable; Bassok & Holyoak, 
1989; Bassok & Olseth, 1995), then no in- 
ferences can be constructed. Second, if ele- 
ments are mismapped, predictable inference 
errors will result (Holyoak et al., 1994; Reed, 
1987). 
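
The CWSG procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used by any published model: the tuple encoding of propositions, the function name `cwsg`, the `new-` placeholder scheme for postulated elements, and the choice to copy an unmapped predicate unchanged are all hypothetical conveniences introduced for the example.

```python
from itertools import count


def cwsg(source_props, mapped_props, mapping):
    """Illustrative copy with substitution and generation.

    source_props: list of propositions, each a tuple (predicate, arg, ...),
                  where an arg is an object name or a nested proposition.
    mapped_props: set of source propositions that already have a target match.
    mapping:      dict from source elements to their target counterparts.
    """
    mapping = dict(mapping)   # work on a copy so the caller's dict is untouched
    fresh = count(1)          # numbers the postulated target elements

    def translate(prop):
        predicate, *args = prop
        new_args = []
        for arg in args:
            if isinstance(arg, tuple):          # nested proposition: recurse
                new_args.append(translate(arg))
            else:
                if arg not in mapping:          # generation: postulate an element
                    mapping[arg] = f"new-{arg}-{next(fresh)}"
                new_args.append(mapping[arg])   # substitution
        # Assumption: a predicate with no known counterpart is copied as is.
        return (mapping.get(predicate, predicate), *new_args)

    inferences = [translate(p) for p in source_props if p not in mapped_props]
    return inferences, mapping


source = [
    ("occupy", "Germany", "Austria"),
    ("cause", ("occupy", "Germany", "Austria"),
              ("counterattack", "Churchill", "Hitler")),
]
mapping = {"occupy": "invade", "Germany": "Iraq", "Austria": "Kuwait",
           "Churchill": "Bush", "Hitler": "Hussein"}

# Only the first source proposition is already matched in the target.
inferences, _ = cwsg(source, {source[0]}, mapping)
print(inferences)
```

The second corollary in the text falls out directly: if the input mapping pairs Churchill with Hussein rather than Bush, the same code produces a correspondingly wrong inference, because substitution follows the mapping it is given.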

All major computational models of ana- 
logical inference use some variant of CWSG 
(e.g., Falkenhainer et al., 1989; Halford et al., 
1994; Hofstadter & Mitchell, 1994; Holyoak 
et al., 1994; Hummel & Holyoak, 2003; 
Keane & Brayshaw, 1988; Kokinov & Petrov,
2001). CWSG is critically dependent on 
variable binding and mapping; hence, mod- 
els that lack these key computational prop- 
erties (e.g., traditional connectionist models)
fail to capture even the most basic aspects
of analogical inference (see Doumas &
Hummel, Chap. 4). 

Although all analogy models use some
form of CWSG, additional constraints 
on this inference mechanism are critical 
(Clement & Gentner, 1991; Holyoak et al., 
1994; Markman, 1997). If CWSG were 
unconstrained, then any unmapped source 
proposition would generate an inference 
about the target. Such a loose criterion for 
inference generation would lead to ram- 
pant errors whenever the source was not 
isomorphic to a subset of the target, and 
such isomorphism will virtually never hold 
for problems of realistic complexity. Sev- 
eral constraints on CWSG were demon- 
strated in a study by Lassaline (1996; also 
see Clement & Gentner, 1991; Spellman 
& Holyoak, 1996). Lassaline had college 
students read analogs describing properties
of hypothetical animals and then rate
various possible target inferences for the 
probability that the conclusion would be 
true given the information in the premise. 
Participants rated potential inferences as 
more probable when the source and target
analogs shared more attributes, and
hence, mapped more strongly. In addition, 
their ratings were sensitive to structural 
and pragmatic constraints. The presence 
of a higher-order linking relation in the 
source made an inference more credible. For 
example, if the source and target animals 
were both described as having an acute 
sense of smell, and the source animal was 
said to have a weak immune system that 
“develops before” its acute sense of smell, 
then the inference that the target animal also 
has a weak immune system would be bol- 
stered relative to stating only that the source 
animal had an acute sense of smell “and” 
a weak immune system. The benefit con- 
veyed by the higher-order relation was in- 
creased if the relation was explicitly causal 
(e.g., in the source animal, a weak immune 
system “causes” its acute sense of smell), 
rather than less clearly causal (“develops 
before”). (See Hummel & Holyoak, 2003, 
for a simulation of this and other inference 
results using a CWSG algorithm.) 


An important question is when analogical
inferences are made and how inferences
generated by CWSG relate to facts about the 
target analog that are stated directly. One 
extreme possibility is that people only make 
analogical inferences when instructed to do 
so and that inferences are carefully “marked” 
as such so they will never be confused with 
known facts about the target. At the other 
extreme, it is possible that some analogi- 
cal inferences are triggered when the target
is first processed (given that the source
has been activated) and that such inferences 
are then integrated with prior knowledge 
of the target. One paradigm for address- 
ing this issue is based on testing for false 
“recognition” of potential inferences in a 
subsequent memory test. The logic of the 
recognition paradigm (Bransford, Barclay, & 
Franks, 1972) is that if an inference has been 
made and integrated with the rest of the 
target analog, then later the reasoner will 
falsely believe that the inference had been 
directly presented. 

Early work by Schustack and Anderson 
(1979) provided evidence that people some- 
times falsely report that analogical infer- 
ences were actually presented as facts. 
Blanchette and Dunbar (2002) performed 
a series of experiments designed to assess 
when analogical inferences are made. They 
had college students (in Canada) read a text 
describing a current political issue, possible 
legalization of marijuana use, which served 
as the target analog. Immediately afterward, 
half the students read, “The situation with 
marijuana can be compared to...”, followed 
by an additional text describing the period 
early in the twentieth century when alco- 
hol use was prohibited. Importantly, the stu- 
dents in the analogy condition were not told 
how prohibition mapped onto the marijuana 
debate, nor were they asked to draw any in- 
ferences. After a delay (1 week in one ex- 
periment, 15 minutes in another), the stu- 
dents were given a list of sentences and were 
asked to decide whether each sentence had 
actually been presented in the text about 
marijuana use. The critical items were sen- 
tences such as “The government could set up 
agencies to control the quality and take over
the distribution of marijuana.” These sentences
had never been presented; however,
they could be generated as analogical infer- 
ences by CWSG based on a parallel state- 
ment contained in the source analog (“The 
government set up agencies to control the 
quality and take over the distribution of al- 
cohol”). Blanchette and Dunbar found that 
students in the analogy condition said “yes” 
to analogical inferences about 50% of the 
time, whereas control subjects who had not 
read the source analog about prohibition said 
“yes” only about 25% of the time. This ten- 
dency to falsely “recognize” analogical infer- 
ences that had never been read was obtained 
both after long and short delays and with 
both familiar and less familiar materials. 

It thus appears that when people notice 
the connection between a source and target, 
and they are sufficiently engaged in an effort 
to understand the target situation, analogi- 
cal inferences will be generated by CWSG 
and then integrated with prior knowledge of 
the target. At least sometimes, an analogical 
inference becomes accepted as a stated fact. 
This result obviously has important impli- 
cations for understanding analogical reason- 
ing, such as its potential for use as a tool 
for persuasion. 


RELATIONAL GENERALIZATION 


In addition to generating local inferences 
about the target by CWSG, analogical rea- 
soning can give rise to relational general- 
izations — abstract schemas that establish 
an explicit representation of the common- 
alities between the source and the target. 
Comparison of multiple analogs can result 
in the induction of a schema, which in 
turn will facilitate subsequent transfer to 
additional analogs. The induction of such 
schemas has been demonstrated in both 
adults (Catrambone & Holyoak, 1989; Gick 
& Holyoak, 1983; Loewenstein, Thompson, 
& Gentner, 1999; Ross & Kennedy, 1990) 
and young children (Brown, Kane, & Echols, 
1986; Chen & Daehler, 1989; Holyoak et al., 
1984; Kotovsky & Gentner, 1996). People 
are able to induce schemas by comparing 
just two analogs to one another (Gick &
Holyoak, 1983). Indeed, people will form
schemas simply as a side effect of applying
one solved source problem to an unsolved 
target problem (Novick & Holyoak, 1991; 
Ross & Kennedy, 1990). 

In the case of problem schemas, more 
effective schemas are formed when the 
goal-relevant relations are the focus rather 
than incidental details (Brown et al., 1986; 
Brown, Kane, & Long, 1989; Gick & 
Holyoak, 1983). In general, any kind of pro- 
cessing that helps people focus on the under- 
lying causal structure of the analogs, thereby 
encouraging learning of more effective prob- 
lem schemas, will improve subsequent trans- 
fer to new problems. For example, Gick 
and Holyoak (1983) found that induction 
of a “convergence” schema from two dis- 
parate analogs was facilitated when each 
story stated the underlying solution prin- 
ciple abstractly: “If you need a large force 
to accomplish some purpose, but are pre- 
vented from applying such a force directly, 
many smaller forces applied simultaneously 
from different directions may work just as 
well.” In some circumstances, transfer can 
also be improved by having the reasoner 
generate a problem analogous to an initial 
example (Bernardo, 2001). Other work has 
shown that abstract diagrams that highlight 
the basic idea of using multiple converging 
forces can aid in schema induction and sub- 
sequent transfer (Beveridge & Parkins, 1987; 
Gick & Holyoak, 1983) — especially when 
the diagram uses motion cues to convey per- 
ception of forces acting on a central target 
(Pedone, Hummel, & Holyoak, 2001; see 
Figure 6.4, top). 

Although two examples can suffice to es- 
tablish a useful schema, people are able to 
incrementally develop increasingly abstract 
schemas as additional examples are provided 
(Brown et al., 1986, 1989; Catrambone & 
Holyoak, 1989). However, even with mul- 
tiple examples that allow novices to start 
forming schemas, people may still fail to 
transfer the analogous solution to a prob- 
lem drawn from a different domain if a 
substantial delay intervenes or if the con- 
text is changed (Spencer & Weisberg, 1986). 
Nonetheless, as novices continue to develop 
Figure 6.4. Sequence of diagrams used to convey the convergence schema by
perceived motion. Top: sequence illustrating convergence (arrows appear to 
move inward in II-IV). Bottom: control sequence in which arrows diverge 
instead of converge (arrows appear to move outward in II-IV). (From Pedone,
Hummel, & Holyoak, 2001, p. 217. Reprinted by permission.)


more powerful schemas, long-term transfer 
in an altered context can be dramatically 
improved (Barnett & Koslowski, 2002). For 
example, Catrambone and Holyoak (1989) 
gave college students a total of three con- 
vergence analogs to study, compare, and 
solve. The students were first asked a series 
of detailed questions designed to encourage 
them to focus on the abstract structure com- 
mon to two of the analogs. After this ab- 
straction training, the students were asked 
to solve another analog from a third do- 
main (not the tumor problem), after which 
they were told the convergence solution to 
it (which most students were able to gen- 
erate themselves). Finally, 1 week later, the 
students returned to participate in a dif- 
ferent experiment. After the other experi- 
ment was completed, they were given the 
tumor problem to solve. More than 80% 
of participants came up with the converg- 
ing rays solution without any hint. As the 
novice becomes an expert, the emerging 
schema becomes increasingly accessible and 
is triggered by novel problems that share its 
structure. Deeper similarities have been constructed
between analogous situations that
fit the schema. As schemas are acquired 
from examples, they in turn guide future 
mappings and inferences (Bassok, Wu, & 
Olseth, 1995). 


Computational Models of Analogy 


From its inception, work on analogy 
in relation to knowledge representation 
has involved the development of detailed 
computational models of the various com- 
ponents of analogical reasoning typically fo- 
cusing on the central process of structure 
mapping. The most influential early models 
included SME (Structure Mapping Engine; 
Falkenhainer, Forbus, & Gentner, 1989), 
ACME (Analogical Mapping by Constraint 
Satisfaction; Holyoak & Thagard, 1989a),
IAM (Incremental Analogy Model; Keane & 
Brayshaw, 1988), and Copycat (Hofstadter 
& Mitchell, 1994). More recently, models 
of analogy have been developed based on 
knowledge representations constrained by 
neural mechanisms (Hummel & Holyoak, 
1992). These efforts include an approach
based on the use of tensor products for vari- 
able binding, the STAR model (Structured 
Tensor Analogical Reasoning; Halford et al., 
1994; see Halford, Chap. 22), and another 
based on neural synchrony, the LISA model 
(Learning and Inference with Schemas and 
Analogies; Hummel & Holyoak, 1997, 2003; 
see Doumas & Hummel, Chap. 4). (For a 
brief overview of computational models of 
analogy, see French, 2002.) Three models are 
sketched to illustrate the general nature of 
computational approaches to analogy. 


Structure Mapping Engine (SME) 


SME (Falkenhainer et al., 1989) illustrates 
how analogical mapping can be performed 
by algorithms based on partial graph match- 
ing. The basic knowledge representation for 
the inputs is based on a notation in the style 
of predicate calculus. If one takes a simple 
example based on the World War II analogy 
as it was used by President George Bush in 
1991, a fragment might look like 


SOURCE:

Führer-of (Hitler, Germany)
occupy (Germany, Austria)
evil (Hitler)
cause [evil (Hitler), occupy (Germany, Austria)]
prime-minister-of (Churchill, Great Britain)
cause [occupy (Germany, Austria), counterattack (Churchill, Hitler)]

TARGET:

president-of (Hussein, Iraq)
invade (Iraq, Kuwait)
evil (Hussein)
cause [evil (Hussein), invade (Iraq, Kuwait)]
president-of (Bush, United States)


SME distinguishes objects (role fillers, 
such as “Hitler”), attributes (one-place pred- 
icates, such as “evil” with its single role filler), 
first-order relations (multiplace predicates, 
such as “occupy” with its two role fillers), and 
higher-order relations (those such as “cause” 
that take at least one first-order relation as a 
role filler). As illustrated in Figure 6.5, the
predicate-calculus notation is equivalent to a
graph structure. An analogical mapping can
then be viewed as a set of correspondences 
between partially matching graph structures. 
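
The four kinds of element that SME distinguishes, and the equivalence between the predicate-calculus notation and a graph, can be illustrated with a short Python sketch. The nested-tuple encoding and the function names `classify` and `graph_edges` are assumptions made for this example and are not part of SME itself; in particular, the sketch treats any predicate that takes a proposition as a role filler as higher-order, which is slightly looser than the definition in the text.

```python
def classify(expr):
    """Label an expression using the four categories described in the text."""
    if isinstance(expr, str):
        return "object"
    _predicate, *fillers = expr
    if any(isinstance(f, tuple) for f in fillers):
        # Assumption: any proposition as a role filler makes the relation higher-order.
        return "higher-order relation"
    return "attribute" if len(fillers) == 1 else "first-order relation"


def graph_edges(expr):
    """List the (parent, child) edges of the tree rooted at a proposition."""
    if isinstance(expr, str):
        return []
    found = []
    for filler in expr[1:]:
        found.append((expr, filler))       # edge from the proposition to its role filler
        found.extend(graph_edges(filler))  # descend into nested propositions
    return found


fact = ("cause", ("evil", "Hitler"), ("occupy", "Germany", "Austria"))
for part in (fact, fact[1], fact[2], "Hitler"):
    print(part, "->", classify(part))
print(len(graph_edges(fact)), "edges")
```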

The heart of the SME algorithm is a pro- 
cedure for finding graph matches that sat- 
isfy certain criteria. The algorithm operates 
in three stages, progressing in a “local-to- 
global” direction. First, SME proposes lo- 
cal matches between all identical predicates 
and their associated role fillers. It is assumed
that similar predicates (e.g., “Führer-of”
and “president-of”; “occupy” and “invade”)
are first transformed into more general predicates
(e.g., “leader-of”; “attack”) that reveal
a hidden identity. (In practice, the programmer
must make the required substitutions
so similar but nonidentical predicates can be 
matched.) The resulting matches are typi- 
cally inconsistent in that one element in the 
source may match multiple elements in the 
target (e.g., Hitler might match either Hus- 
sein or Bush because all are “leaders”). Sec- 
ond, the resulting local matches are inte- 
grated into structurally consistent clusters or 
“kernels” (e.g., the possible match between 
Hitler and Bush is consistent with that be- 
tween Germany and the United States, and 
so these matches would form part of a sin- 
gle kernel). Third, the kernels are merged 
into a small number of sets that are maximal
in size (i.e., that include matches between
the greatest number of nodes in the
two graphs), while maintaining correspon- 
dences that are structurally consistent and 
one to one. SME then ranks the result- 
ing sets of mappings by a structural eval- 
uation metric that favors “deep” mappings 
(ones that include correspondences between 
higher-order relations). For our example, 
the optimal set will respectively map Hitler, 
Germany, Churchill, and Great Britain to 
Hussein, Iraq, Bush, and the United States 
because of the support provided by the map- 
ping between the higher-order “cause” rela- 
tions involving “occupy/invade.” Using this 
optimal mapping, SME applies a CWSG al- 
gorithm to generate inferences about the 
target based on unmapped propositions in 
the source. Here, the final “cause” relation 
Figure 6.5. SME’s graphical representation of a source and target analog.


in the source will yield the analogical infer- 
ence, cause [attack (Iraq, Kuwait), counter- 
attack (Bush, Hussein)]. 
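
The outcome described for this example can be reproduced with a deliberately naive Python sketch. It is not SME's algorithm: instead of building and merging kernels, it exhaustively tries every one-to-one object mapping, and it uses a simple depth-weighted count as a stand-in for the structural evaluation metric. The predicates are assumed to have been recoded by hand into “leader-of” and “attack”, as the text notes the programmer must do, and all function names are hypothetical.

```python
from itertools import permutations


def objects(props):
    """Collect the object names (leaf role fillers) used in a list of propositions."""
    found = set()

    def walk(expr):
        if isinstance(expr, tuple):
            for filler in expr[1:]:
                walk(filler)
        else:
            found.add(expr)

    for prop in props:
        walk(prop)
    return sorted(found)


def substitute(expr, mapping):
    """Rewrite a source expression with target objects; predicates are kept."""
    if isinstance(expr, tuple):
        return (expr[0], *(substitute(f, mapping) for f in expr[1:]))
    return mapping[expr]


def depth(expr):
    """1 for attributes and first-order relations, more for higher-order ones."""
    if not isinstance(expr, tuple):
        return 0
    return 1 + max(depth(f) for f in expr[1:])


def best_mapping(source, target):
    """Brute-force search; needs at least as many target objects as source objects."""
    src, tgt, target_set = objects(source), objects(target), set(target)
    best, best_score = None, -1
    for perm in permutations(tgt, len(src)):
        mapping = dict(zip(src, perm))     # one-to-one by construction
        score = sum(depth(p) for p in source
                    if substitute(p, mapping) in target_set)
        if score > best_score:
            best, best_score = mapping, score
    return best, best_score


SOURCE = [
    ("leader-of", "Hitler", "Germany"),
    ("attack", "Germany", "Austria"),
    ("evil", "Hitler"),
    ("cause", ("evil", "Hitler"), ("attack", "Germany", "Austria")),
    ("leader-of", "Churchill", "Great Britain"),
    ("cause", ("attack", "Germany", "Austria"),
              ("counterattack", "Churchill", "Hitler")),
]
TARGET = [
    ("leader-of", "Hussein", "Iraq"),
    ("attack", "Iraq", "Kuwait"),
    ("evil", "Hussein"),
    ("cause", ("evil", "Hussein"), ("attack", "Iraq", "Kuwait")),
    ("leader-of", "Bush", "United States"),
]

mapping, score = best_mapping(SOURCE, TARGET)
# Source propositions with no counterpart in the target become candidate inferences.
inferences = [substitute(p, mapping) for p in SOURCE
              if substitute(p, mapping) not in set(TARGET)]
print(mapping)
print(inferences)
```

Exhaustive search is feasible here only because each analog has five objects; the local-to-global kernel strategy described in the text exists precisely to avoid this combinatorial cost.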

SME thus models the mapping and in- 
ference components of analogical reason- 
ing. A companion model, MACFAC (“Many 
Are Called but Few Are Chosen”; Forbus, 
Gentner, & Law, 1995) deals with the ini- 
tial retrieval of a source analog from long- 
term memory. MACFAC has an initial stage 
(“many are called”) in which analogs are rep- 
resented by content vectors, which code the 
relative number of occurrences of a particular
predicate in the corresponding structured
representation. (Content vectors are
computed automatically from the underlying
structural representations.) The content
vector for the target is then matched to vec- 
tors for all analogs stored in memory, and 
the dot product for each analog pair is cal- 
culated as an index of similarity. The source 
analog with the highest dot product, plus 
other stored analogs with relatively high dot 
products, are marked as retrieved. In its sec- 
ond stage, MACFAC uses SME to assess 
the degree of the structural overlap between
the target and each possible source, allowing
the program to identify a smaller number of
potential sources that have the highest de- 
grees of structural parallelism with the target 
(“few are chosen”). As the content vectors 
used in the first stage of MACFAC do not 
code role bindings, the model provides a 
qualitative account of why the retrieval stage 
of analogy is less sensitive to structure than 
is the mapping stage. 
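
The first stage of retrieval can be sketched as follows. This is an illustrative approximation rather than the MACFAC code: the vectors hold raw predicate counts instead of relative frequencies, and the `keep_ratio` threshold for deciding which analogs count as having “relatively high” dot products is an assumed parameter, as are the function names and the toy memory contents.

```python
from collections import Counter


def content_vector(props):
    """Count how often each predicate occurs, ignoring who fills which role."""
    counts = Counter()

    def walk(expr):
        if isinstance(expr, tuple):
            counts[expr[0]] += 1
            for filler in expr[1:]:
                walk(filler)

    for prop in props:
        walk(prop)
    return counts


def dot(u, v):
    """Dot product of two sparse count vectors."""
    return sum(u[k] * v[k] for k in u.keys() & v.keys())


def many_are_called(target, memory, keep_ratio=0.9):
    """Return the names of stored analogs whose score is near the best score."""
    target_vec = content_vector(target)
    scores = {name: dot(target_vec, content_vector(props))
              for name, props in memory.items()}
    best = max(scores.values(), default=0)
    return [name for name, s in scores.items()
            if s > 0 and s >= keep_ratio * best]


memory = {
    "ww2": [("attack", "Germany", "Austria"), ("evil", "Hitler")],
    "reversed": [("attack", "Austria", "Germany"), ("evil", "Hitler")],
    "picnic": [("eat", "Ann", "apple")],
}
target = [("attack", "Iraq", "Kuwait"), ("evil", "Hussein")]
retrieved = many_are_called(target, memory)
print(retrieved)
```

Because the vectors record only which predicates occur, the "ww2" and "reversed" analogs receive identical scores even though attacker and victim are swapped. That is the sense in which this stage is blind to role bindings, leaving structural discrimination to the second stage.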


Analogical Mapping by Constraint 
Satisfaction (ACME) 


The ACME model (Holyoak, Novick, & 
Melz, 1994; Holyoak & Thagard, 1989a) was 
directly influenced by connectionist mod- 
els based on parallel constraint satisfac- 
tion (Rumelhart, Smolensky, McClelland, 
& Hinton, 1986; see Doumas & Hummel, 
Chap. 4). ACME takes as input symbolic 
representations of the source and target 
analogs in essentially the same form as those 
used in SME. However, whereas SME fo- 
cuses on structural constraints, ACME in- 
stantiates a multiconstraint theory in which 
structural, semantic, and pragmatic con- 
straints interact to determine the optimal 
mapping. ACME accepts a numeric code 
for degree of similarity between predicates, 
which it uses as a constraint on mapping. 
Thus, ACME, unlike SME, can match simi- 
lar predicates (e.g., “occupy” and “invade”) 
without explicitly recoding them as iden- 
tical. In addition, ACME accepts a nu- 
meric code for the pragmatic importance 
of a possible mapping, which is also used 
as a constraint. 

ACME is based on a constraint satis- 
faction algorithm, which proceeds in three 
steps. First, a connectionist “mapping net- 
work” is constructed in which the units rep- 
resent hypotheses about possible element 
mappings and the links represent specific in- 
stantiations of the general constraints (Fig- 
ure 6.6). Second, an interactive-activation 
algorithm operates to “settle” the map- 
ping network in order to identify the set 
of correspondences that collectively repre- 
sent the “optimal” mapping between the 
analogs. Any constraint may be locally violated
to establish optimal global coherence.
Third, if the model is being used to gener- 
ate inferences and correspondences, CWSG 
is applied to generate inferences based 
on the correspondences identified in the 
second step. 
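
The first two steps can be illustrated with a small Python sketch of a mapping network. It is a simplified stand-in for ACME rather than a reimplementation: it handles only flat propositions, the link weights, decay rate, and number of update cycles are assumed values, the update rule is a generic interactive-activation rule, and the semantic constraint is supplied as a hypothetical `similarity` table that gives extra input to units pairing similar predicates. A pragmatic constraint could be added in the same way as further input to selected units.

```python
from itertools import combinations

EXCITE, INHIBIT = 0.1, -0.2   # illustrative link weights (assumed values)


def build_network(source, target):
    """Create mapping-hypothesis units and the constraint links between them."""
    units, weights = set(), {}

    def link(a, b, w):
        if a != b:
            key = frozenset((a, b))
            weights[key] = weights.get(key, 0.0) + w

    for s in source:
        for t in target:
            if len(s) != len(t):          # only propositions of equal arity can map
                continue
            hyps = list(zip(s, t))        # predicate pair, then role-filler pairs
            units.update(hyps)
            for a, b in combinations(hyps, 2):
                link(a, b, EXCITE)        # structural consistency: mutual support
    units = sorted(units)
    for a, b in combinations(units, 2):
        if a[0] == b[0] or a[1] == b[1]:
            link(a, b, INHIBIT)           # one-to-one constraint: rivals compete
    return units, weights


def settle(units, weights, bias, steps=200, decay=0.1, lo=-1.0, hi=1.0):
    """Synchronously update activations for a fixed number of cycles."""
    act = {u: 0.01 for u in units}
    for _ in range(steps):
        new = {}
        for u in units:
            net = bias.get(u, 0.0)
            for v in units:
                if v != u:
                    # Assumption: only positively active units send output.
                    net += weights.get(frozenset((u, v)), 0.0) * max(act[v], 0.0)
            change = net * (hi - act[u]) if net > 0 else net * (act[u] - lo)
            new[u] = min(hi, max(lo, act[u] * (1 - decay) + change))
        act = new
    return act


def best_match(act, element):
    """Target element whose mapping unit for `element` is most active."""
    candidates = {t: a for (s, t), a in act.items() if s == element}
    return max(candidates, key=candidates.get)


source = [("Führer-of", "Hitler", "Germany"), ("occupy", "Germany", "Austria")]
target = [("president-of", "Hussein", "Iraq"), ("invade", "Iraq", "Kuwait")]
similarity = {("Führer-of", "president-of"): 0.1, ("occupy", "invade"): 0.1}

units, weights = build_network(source, target)
act = settle(units, weights, similarity)
for element in ("Hitler", "Germany", "Austria"):
    print(element, "->", best_match(act, element))
```

Note that the similar but nonidentical predicates are never recoded as identical; the similarity input merely tilts the competition, and the structural links do the rest.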

ACME has a companion model, ARCS 
(Analog Retrieval by Constraint Satis- 
faction; Thagard, Holyoak, Nelson, & 
Gochfeld, 1990) that models analog re- 
trieval. Analogs in long-term memory are 
connected within a semantic network (see 
Medin & Rips, Chap. 3); this network of 
concepts provides the initial basis by which 
a target analog activates potential source 
analogs. Those analogs in memory that are 
identified as having semantic links to the tar- 
get (i.e., those that share similar concepts) 
then participate in an ACME-like con- 
straint satisfaction process to select the opti- 
mal source. The constraint network formed 
by ARCS is restricted to those concepts 
in each analog that have semantic links; 
hence, ARCS shows less sensitivity to struc- 
ture in retrieval than does ACME in map- 
ping. Because constraint satisfaction algo- 
rithms are inherently competitive, ARCS 
can model the finding that analogical ac- 
cess is more sensitive to structure when sim- 
ilar source analogs in long-term memory 
compete to be retrieved (Wharton et al.,
1994, 1996).


Learning and Inference with Schemas 
and Analogies (LISA) 


Similar to ACME, the LISA model 
(Hummel & Holyoak, 1997, 2003) is 
based on the principles of the multicon- 
straint theory of analogy; unlike ACME, 
LISA operates within psychologically and 
neurally realistic constraints on working 
memory (see Doumas & Hummel, Chap. 4; 
Morrison, Chap. 19). The models discussed 
previously include at most localist rep- 
resentations of the meaning of concepts 
(e.g., a semantic network in the case of 
ARCS), and most of their processing is 
performed on propositional representations 
unaccompanied by any more detailed level 
of conceptual representation (e.g., neither 
Figure 6.6. A constraint-satisfaction network in ACME.


ACME nor SME includes any represen- 
tation of the meaning of concepts). LISA 
also goes beyond previous models in that 
it provides a unified account of all the 
major components of analogical reasoning 
(retrieval, mapping, inference, and relational
generalization).

LISA represents propositions using a hi- 
erarchy of distributed and localist units (see 
Figure 4.1 in Doumas & Hummel, Chap. 4).
LISA includes both a long-term memory 
for propositions and concept meanings and 
a limited-capacity working memory. LISA’s 
working memory representation, which uses 
neural synchrony to encode role-filler bind- 
ings, provides a natural account of the ca- 
pacity limits of working memory because it 
is only possible to have a finite number of 
bindings simultaneously active and mutually 
out of synchrony. 

Analog retrieval is accomplished as a form 
of guided pattern matching. Propositions in a 
target analog generate synchronized patterns 


of activation on the semantic units, which in 
turn activate propositions in potential source 
analogs residing in long-term memory. The 
resulting coactivity of source and target el- 
ements, augmented with a capacity to learn 
which structures in the target were coactive 
with which in the source, serves as the basis 
for analogical mapping. LISA includes a set 
of mapping connections between units of the 
same type (e.g., object, predicate) in sepa- 
rate analogs. These connections grow when- 
ever the corresponding units are active si- 
multaneously and thereby permit LISA to 
learn the correspondences between struc- 
tures in separate analogs. They also permit 
correspondences learned early in mapping to 
influence the correspondences learned later. 
Augmented with a simple algorithm for self- 
supervised learning, the mapping algorithm 
serves as the basis for analogical inference 
by CWSG. Finally, augmented with a sim- 
ple algorithm for intersection discovery, self- 
supervised relational learning serves as the 
used to simulate a wide range of data on 
analogical reasoning (Hummel & Holyoak, 
1997, 2003), including both behavioral 
and neuropsychological studies (Morrison 
et al., 2004). 
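
The idea that mapping connections grow whenever units in separate analogs are active at the same time can be illustrated with a simple Hebbian-style update. This is only a sketch under strong simplifying assumptions, not LISA's actual algorithm: the unit names, the activation values, and the learning rate are all hypothetical, and synchrony-based binding is not modeled.

```python
def update_mapping(weights, target_activity, source_activity, rate=0.1):
    """Grow each target-source connection in proportion to joint activity."""
    for t_unit, t_act in target_activity.items():
        for s_unit, s_act in source_activity.items():
            key = (t_unit, s_unit)
            weights[key] = weights.get(key, 0.0) + rate * t_act * s_act
    return weights

mapping = {}
# Hypothetical activation patterns on object units over three time steps:
# (target analog activity, source analog activity).
steps = [
    ({"t_obj1": 1.0, "t_obj2": 0.0}, {"s_obj1": 0.9, "s_obj2": 0.1}),
    ({"t_obj1": 0.0, "t_obj2": 1.0}, {"s_obj1": 0.2, "s_obj2": 0.8}),
    ({"t_obj1": 1.0, "t_obj2": 0.0}, {"s_obj1": 0.8, "s_obj2": 0.0}),
]
for target, source in steps:
    update_mapping(mapping, target, source)

# The strongest connection from each target unit is taken as its mapping.
best_for_t_obj1 = max(["s_obj1", "s_obj2"], key=lambda s: mapping[("t_obj1", s)])
best_for_t_obj2 = max(["s_obj1", "s_obj2"], key=lambda s: mapping[("t_obj2", s)])
```

Because the weights accumulate across steps, correspondences established early carry forward, which is one simple way to picture how early mappings could influence later ones.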


Conclusions and Future Directions 


When we think analogically, we do much 
more than just compare two analogs based 
on obvious similarities between their el- 
ements. Rather, analogical reasoning is a 
complex process of retrieving structured 
knowledge from long-term memory, repre- 
senting and manipulating role-filler bind- 
ings in working memory, performing self- 
supervised learning to form new inferences, 
and finding structured intersections between 
analogs to form new abstract schemas. The 
entire process is governed by the core con- 
straints provided by isomorphism, similarity 
of elements, and the goals of the reasoner 
(Holyoak & Thagard, 1989a). These constraints 
apply in all components of analogical 
reasoning: retrieval, mapping, inference, 
and relational generalization. When analogs 
are retrieved from memory, the constraint of 
element similarity plays a large role, but rela- 
tional structure is also important — especially 
when multiple source analogs similar to the 
target are competing to be selected. For 
mapping, structure is the most important 
constraint but requires adequate working 
memory resources; similarity and purpose 
also contribute. The success of analogical in- 
ference ultimately depends on whether the 
purpose of the analogy is achieved, but satis- 
fying this constraint is intimately connected 
with the structural relations between the 
analogs. Finally, relational generalization oc- 
curs when schemas are formed from the 
source and target to capture those structural 
patterns in the analogs that are most rele- 
vant to the reasoner’s purpose in exploiting 
the analogy. 

Several current research directions are 
likely to continue to develop. Computa- 
tional models of analogy, such as LISA 
(Hummel & Holyoak, 1997, 2003), have 


ogy with research in cognitive neuroscience 
(Morrison et al., 2004). We already have 
some knowledge of the general neural cir- 
cuits that underlie analogy and other forms 
of reasoning (see Goel, Chap. 20). As 
more sophisticated noninvasive neuroimag- 
ing methodologies are developed, it should 
become possible to test detailed hypothe- 
ses about the neural mechanisms underly- 
ing analogy, such as those based on temporal 
properties of neural systems. 

Most research and modeling in the field 
of analogy has emphasized quasilinguistic 
knowledge representations, but there is good 
reason to believe that reasoning in general 
has close connections to perception (e.g., 
Pedone et al., 2001). Perception provides 
an important starting point for grounding at 
least some “higher” cognitive representations 
(Barsalou, 1999). Some progress has been 
made in integrating analogy with perception. 
For example, the LISA model has been aug- 
mented with a Metric Array Module (MAM; 
Hummel & Holyoak, 2001), which provides 
specialized processing of metric information 
at a level of abstraction applicable to both 
perception and quasispatial concepts. How- 
ever, models of analogy have generally failed 
to address evidence that the difficulty of 
solving problems and transferring solution 
methods to isomorphic problems is depen- 
dent on the difficulty of perceptually encod- 
ing key relations. The ease of solving appar- 
ently isomorphic problems (e.g., isomorphs 
of the well-known Tower of Hanoi) can vary 
enormously, depending on perceptual cues 
(Kotovsky & Simon, 1990; see Novick & Bas- 
sok, Chap. 14). 

More generally, models of analogy have 
not been well integrated with models of 
problem solving (see Novick & Bassok, 
Chap. 14), even though analogy clearly af- 
fords an important mechanism for solving 
problems. In its general form, problem solv- 
ing requires sequencing multiple operators, 
establishing subgoals, and using combina- 
tions of rules to solve related but non- 
isomorphic problems. These basic require- 
ments are beyond the capabilities of virtually 
all computational models of analogy (but 
see Holyoak & Thagard, 1989b, for an 
analogy within a rule-based problem-solving 
system). The most successful models of 
human problem solving have been formu- 
lated as production systems (see Lovett & 
Anderson, Chap. 17), and Salvucci and An- 
derson (2001) developed a model of anal- 
ogy based on the ACT-R production system. 
However, this model is unable to solve re- 
liably any analogy that requires integration 
of multiple relations — a class that includes 
analogies within the grasp of young children 
(Halford, 1993; Richland et al., 2004; see 
Halford, Chap. 22). The integration of anal- 
ogy models with models of general problem 
solving remains an important research goal. 

Perhaps the most serious limitation of 
current computational models of analogy is 
that their knowledge representations must 
be hand-coded by the modeler, whereas hu- 
man knowledge representations are formed 
autonomously. Closely related to the chal- 
lenge of avoiding hand-coding of represen- 
tations is the need to flexibly rerepresent 
knowledge to render potential analogies per- 
spicuous. Concepts often have a close con- 
ceptual relationship with more complex re- 
lational forms (e.g., Jackendoff, 1983). For 
example, causative verbs such as lift (e.g., 
“John lifted the hammer”) have very simi- 
lar meanings to structures based on an ex- 
plicit higher-order relation, cause (e.g., “John 
caused the hammer to rise”). In such cases, 
the causative verb serves as a “chunked” rep- 
resentation of a more elaborate predicate- 
argument structure. People are able to “see” 
analogies even when the analogs have very 
different linguistic forms (e.g., “John lifted 
the hammer in order to strike the nail” might 
be mapped onto “The Federal Reserve used 
an increase in interest rates as a tool in its 
efforts to drive down inflation”). A deeper 
understanding of human knowledge repre- 
sentation is a prerequisite for a complete the- 
ory of analogical reasoning. 
CHAPTER 7 


Causal Learning 


Marc J. Buehner 
Patricia W. Cheng 


Introduction 


This chapter is an introduction to the psy- 
chology of causal inference using a compu- 
tational perspective with the focus on causal 
discovery. It explains the nature of the prob- 
lem of causal discovery and illustrates the 
goal of the process with everyday and hypo- 
thetical examples. It reviews two approaches 
to causal discovery, a purely statistical ap- 
proach and an alternative approach that in- 
corporates causal hypotheses in the infer- 
ence process. The latter approach provides 
a coherent framework within which to an- 
swer different questions regarding causal in- 
ference. The chapter ends with a discussion 
of two additional issues — the level of abstrac- 
tion of the candidate cause and the tempo- 
ral interval between the occurrence of the 
cause and the occurrence of the effect — and 
a sketch of future directions for the field. 


The Nature of the Problem and a 
Historical Review: Is Causality an 
Inscrutable Fetish or the Cement 
of the Universe? 


Imagine a world in which we could not rea- 
son about causes and effects. What would it 
be like? Typically, reviews about causal rea- 
soning begin by declaring that causal rea- 
soning enables us to predict and control 
our environment and by stating that causal 
reasoning allows us to structure an other- 
wise chaotic flux of events into meaningful 
episodes. In other words, without causal in- 
ference, we would be unable to learn from 
the past and incapable of manipulating our 
surroundings to achieve our goals. Let us 
see how a noncausal world would be grim 
and the exact role causal inference plays for 
adaptive intelligence. We illustrate the non- 
causal world by intuitive examples as well 
other purely statistical models, that is, models that 
do not go through an intermediate step of 
positing hypotheses about causal relations in 
the world rather than just in the head. 

We want to see the goals of causal reason- 
ing; we also want to see what the givens are, 
so we can step back and see what the prob- 
lem of causal learning is. One way of casting 
this problem is to ask, “What minimal set 
of processes would one endow an artificial 
system with, so that when put on Planet Earth 
and given the types of information humans 
receive, it will evolve to represent the world 
as they do?” For example, what process must 
the system have so it would know that ex- 
posure to the sun causes tanning in skin but 
bleaching in fabrics? These causal facts are 
unlikely to be innate in humans. The learning 
process would begin with noncausal obser- 
vations. For both cases, the input would be 
observations on various entities (people and 
articles of clothing, respectively) with vary- 
ing exposures to sunlight and, in one case, 
the darkness of skin color and, in the other, 
the darkness of fabric colors. Consider an- 
other example: Suppose the system is pre- 
sented with observations that a rooster in 
a barn crowed soon before sunrise and did 
not crow at other times during the day when 
the sun did not rise. What process must the 
system have so it would predict that the 
sun would soon rise when informed that the 
rooster had just spontaneously crowed but 
would not predict the same when informed 
that the rooster had just been deceived into 
crowing by artificial lighting? Neither would 
the system recommend starting a round-the- 
clock solar energy enterprise even if there 
were reliable ways of making roosters crow. 
Nor would it, when a sick rooster is ob- 
served not to crow, worry about cajoling it 
into crowing to ensure that the sun will rise 
in the morning. In a noncausal world, such 
recommendations and worries would be nat- 
ural (also see Sloman & Lagnado, Chap. 5). 

Our examples illustrate that by keep- 
ing track of events that covary (i.e., vary 
together, are statistically associated), one 
would be able to predict a future event from 
a covariation provided that causes of that 


might be unable to predict the consequences 
of actions (e.g., exposure to the sun, deceiv- 
ing the rooster into crowing). Causation, and 
only causation, licenses the prediction of the 
consequences of actions. Both kinds of pre- 
dictions are obviously helpful (e.g., we ap- 
preciate weather reports), but the latter is 
what allows (1) goal-directed behaviors to 
achieve their goals and (2) maladaptive rec- 
ommendations that accord with mere corre- 
lations to be dismissed. The examples also 
illustrate that only causation supports ex- 
planation (Woodward, 2003). Whereas one 
would explain that one’s skin is tanned be- 
cause of exposure to the sun, one would not 
explain that the sun rises because the rooster 
crows, despite the reliable predictions that 
one can make in each case. Understanding 
what humans do when they reason about 
causation is a challenge, and the ability to 
build a system that accomplishes what hu- 
mans accomplish is a test of one’s under- 
standing of that psychological process. 

We see that even when there is temporal 
information so one can reliably predict an 
event from an earlier observation (e.g., sun- 
rise from a rooster’s crowing, a storm from a 
drop in the barometric reading), correlation 
need not imply causation. One might think 
that intervention (i.e., action, manipulation) 
is what differentiates between covariation 
and causation: When the observations are 
obtained by intervention, by oneself or oth- 
ers, the covariations are causal; otherwise, 
they are not necessarily causal. A growing 
body of research is dedicated to the role 
of intervention in causal learning, discov- 
ery, and reasoning (e.g., Gopnik et al., 2004; 
Lagnado & Sloman, 2004; Steyvers, Tenen- 
baum, Wagenmakers, & Blum, 2003). In- 
deed, the general pattern reported is that ob- 
servations based on intervention allow causal 
inferences that are not possible with mere 
observations. However, although interven- 
tion generally allows causal inference, it does 
not guarantee it. Consider a food allergy 
test that introduces samples of food into 
the body by needle punctures on the skin. 
The patient may react with hives on all 
punctured spots, and yet one may not know 
foods. Suppose the patient’s skin is allergic 
to needle punctures so hives also appear on 
punctured spots without food. In this exam- 
ple, there is an intervention, but no causal 
inference regarding food allergy seems war- 
ranted (Cheng, 1997). What then are the 
conditions that allow causal discovery? Note 
that in this example the intervention was 
suboptimal because two interventions oc- 
curred concurrently (adding allergens into 
the bloodstream and puncturing the skin), 
resulting in confounding. 

Historically, causality has been the 
domain of philosophers, from Aristotle 
through to Hume and Kant, to name just a 
few. The fundamental challenge since Hume 
(1739/1888) that has been occupying schol- 
ars in this area is that causality per se is not 
directly in the input. This issue fits well in 
the framework of creating an artificial rea- 
soning system — causal knowledge has to 
emerge from noncausal input. Nothing in 
the evidence available to our sensory system 
can assure someone of a causal relation 
between, say, flicking a switch and the 
hallway lights turning on. Yet, we regularly 
and routinely have strong convictions about 
causality. David Hume made a distinction 
between analytic and empirical knowledge. 
Moreover, he pointed out that causal knowl- 
edge is empirical, and that of this kind of 
knowledge, we can only be certain of the 
states of observable events or objects (e.g., 
the presence of an event of interest and its 
magnitude) and the temporal and spatial re- 
lations between them. Any impression of 
causality linking two constituent events, he 
argued, is a mental construct. 

Psychologists entered the arena to study 
the exact nature and determinants of such 
mental constructs. Michotte (1946/1963) 
investigated the perceptual processing of 
causal events (mostly impact of one mov- 
ing object on another object, the “launching 
effect”). Many researchers since then have 
argued that such perception of causality is 
modular or encapsulated (for an overview, 
see Scholl & Tremoulet, 2000) and not sub- 
ject to conscious inference. To some, the 
encapsulation puts the process outside the 


however, the problem has the same general 
core: How would an intelligent system trans- 
form noncausal input into a causal relation 
as its output? That problem remains, despite 
the additional innate or learned spatiotemporal 
constraints (see Cheng, 1993, for an 
inductive analysis of the launching effect, 
and Scholl & Nakayama, 2002, for a demon- 
stration of inductive components of the vi- 
sual system’s analysis of launching events). 

Causal discovery is not the only process 
with which one would endow the artificial 
reasoning system. Many psychologists have 
addressed a related but distinct issue of up- 
dating and applying prior causal knowledge. 
Once causal knowledge is acquired, it would 
be efficient to apply it to novel situations in- 
volving events of like kind. We are all famil- 
iar with such applications of causal knowl- 
edge transmitted culturally or acquired on 
our own. A number of researchers have pro- 
posed Bayesian accounts of the integration 
of prior causal knowledge and current in- 
formation (Anderson, 1990; Tenenbaum & 
Griffiths, 2002). It may seem that there is 
a ready answer to the updating and appli- 
cation problem. What may not be straight- 
forward, however, is the determination of 
“events of like kind,” the variables in a causal 
relation. The application of causal knowl- 
edge therefore highlights an issue that has 
been mostly neglected in the research on 
causal discovery: What determines which 
categories are formed and the level of ab- 
straction at which they are formed (see 
Medin & Rips, Chap. 3; Rosch, 1978)? Sim- 
ilarly, what determines which events are re- 
garded as analogous (see Holyoak, Chap. 6)? 
The “cause” categories in causal learning ex- 
periments were typically predefined by the 
experimenter in terms of a variable with 
a single causal value and do not have the 
structure of natural categories (see Lien & 
Cheng, 2000, for an exception). If the rela- 
tions inferred have no generality, they can- 
not be applied to novel but similar events, 
thus failing to fulfill a primary function of 
causal inference. 

It is perhaps the segregation of re- 
search on category formation and on causal 
view, which pits top-down and bottom-up 
causal reasoning against each other. It has 
been argued that inferring a causal connection 
is contingent on insight into the mechanism 
(i.e., a network of intervening causal 
relations) by which the candidate cause 
brings about its effect (e.g., Ahn, Kalish, 
Medin, & Gelman, 1995). A commonly 
used research paradigm involved providing 
participants with current information con- 
cerning the covariation between potential 
causes and effects at some designated level 
of abstraction but manipulating whether a 
(plausible) causal mechanism was presented 
(Ahn, Kalish, Medin, & Gelman, 1995; Bul- 
lock, Gelman, & Baillargeon, 1982; Shultz, 
1982; White, 1995), with the causal mecha- 
nism implying more reliable covariation in- 
formation at a different, more abstract, level. 
The common finding from these studies was 
that participants deemed knowledge about 
causal power or force as more significant 
than what was designated as covariational 
information. Studies in this “causal power” 
tradition are valuable in that they demon- 
strate the role of abduction and coherence: 
People indeed strive to link causes and 
effects mentally by postulating the (per- 
haps hypothetical) presence of some known 
causal mechanism that connects them in an 
attempt to create the most coherent ex- 
planation encompassing multiple relevant 
pieces of knowledge (see Holland, Holyoak, 
Nisbett, & Thagard, 1986, on abduction; see 
Thagard, 1989, for accounts of coherence). 
This work shows that coherence plays a key 
role in the application of causal knowledge 
(also see Lien & Cheng, 2000; coherence also 
plays a role in causal discovery, see Cheng, 
1993). However, the argument that inferring 
a causal relation is contingent on belief in 
an underlying causal network is circular — it 
simply pushes the causal discovery question 
one step back. How was knowledge about 
the links in the causal network discovered in 
the first place? 

Rather than pitting covariation and 
prior causal knowledge against each other, 
Thagard (2000) offered a complementary 
view of covariation, prior causal knowledge, 


tific explanation. Illustrating with cases in 
medical history (e.g., the bacterial theory of 
ulcers), he showed that inferring a causal 
connection is not contingent on insight into 
an intervening mechanism, but is bolstered 
by it. The inferred causal networks subse- 
quently explain novel instances when the 
networks are instantiated by information 
on the instances. Maximizing explanatory 
coherence might be a process closely inter- 
twined with causal discovery, but nonethe- 
less separate from it, that one would incor- 
porate in an artificial reasoning system. 

In the rest of this chapter, we review the 
main computational accounts of causal dis- 
covery. We first review statistical models, 
then problems with the statistical approach, 
problems that motivate a causal account that 
incorporates assumptions involving alterna- 
tive causes. We follow these accounts with 
a review of new empirical tests of the two 
approaches. We then broaden our scope to 
consider the possible levels of abstraction of 
a candidate cause and the analogous problem 
of the possible temporal lag of a causal rela- 
tion. These issues have implications for cat- 
egory formation. We end the chapter with a 
sketch of future research directions from a 
computational perspective. 


Information Processing Accounts 


A Statistical Approach 


OVERVIEW 


Some computational accounts of causal dis- 
covery are only concerned with statistical 
information (e.g., Allan & Jenkins, 1980; 
Chapman & Robbins, 1990; Jenkins & Ward, 
1965; Rescorla & Wagner, 1972), ignoring 
hypotheses regarding unobservable causal 
relations (see Gallistel’s, 1990, critique of 
these models as being unrepresentational). 
Such accounts not only adopt Hume’s 
(1739/1888) problem but also his solution. 
To these theorists, causality is nothing more 
than a mental habit, a fictional epiphe- 
nomenon floating unnecessarily on the sur- 
face of indisputable facts. After all, causal 
Pearson, one of the fathers of modern statistics, 
subscribed to a positivist view and concluded 
that calculating correlations is the 
ultimate and only meaningful transforma- 
tion of evidence at our disposal: “Beyond 
such discarded fundamentals as ‘matter’ and 
‘force’ lies still another fetish amidst the in- 
scrutable arcana of modern science, namely, 
the category of cause and effect” (Pearson, 
1892/1957). Correlation at least enables one 
to make predictions based on observations 
even when the predictions are not accom- 
panied by causal understanding. 

Psychological work in this area was pi- 
oneered by social psychologists, most no- 
tably Kelley (1973), who studied causal at- 
tributions in interpersonal exchanges. His 
ANOVA model specifies a set of inference 
rules that indicate, for instance, whether 
a given outcome arose owing to particular 
aspects of the situation, the involved 
person(s), or both. 

Around the same time in a different do- 
main (Pavlovian and instrumental condition- 
ing), prediction based on observations was 
also the primary concern. Predictive learn- 
ing in conditioning, often involving non- 
human animals, and causal reasoning in hu- 
mans showed so many parallels (Rescorla, 
1988) that associative learning theorists 
were prompted to apply models of condi- 
tioning to explain causal reasoning. Explain- 
ing causal learning with associative theories 
implies a mapping of causes to cues (or CSs) 
and effects to outcomes (or USs). In a de- 
tailed review, Shanks and Dickinson (1987; 
see Dickinson, 2001, for a more recent re- 
view) noted that the two cornerstones of 
associative learning, cue-outcome contingency 
and temporal contiguity, also drive human 
causal learning (also see Miller & Matute, 
1996). To a first approximation, association 
matters: The more likely that a cause will 
be followed by an effect, the stronger par- 
ticipants believe that they are causally re- 
lated. However, if this probability stays con- 
stant, but the probability with which the ef- 
fect occurs in the absence of the cause in- 
creases, causal judgments tend to decrease; 
in other words, it is contingency that matters 
Figure 7.1. A standard 2 x 2 contingency table. 
A through D are labels for the frequencies of 
event types resulting from a factorial 
combination of the presence and absence of 
cause c and effect e. 


(see Rescorla, 1968, for a parallel demon- 
stration of the role of contingency in rats). 
As for temporal contiguity, Shanks, Pearson, 
and Dickinson (1989) showed that separat- 
ing cause and effect in time tends to decrease 
impressions of causality (see also Buehner & 
May, 2002, 2003, 2004). This pattern of re- 
sults, Shanks and Dickinson argued, parallels 
well-established findings from conditioning 
studies involving nonhuman animals. 

Contingency and temporal contiguity are 
conditions that enable causal learning. A 
robust feature of the resultant acquisition 
of causal knowledge is that it is gradual 
and can be described by a negatively ac- 
celerated learning curve with judgments 
reaching an equilibrium level under some 
conditions after sufficient training (Shanks, 
1985a, 1987). 

A Statistical Model for Situations with One 
Varying Candidate Cause. For situations 
involving only one varying candidate cause, 
an influential decision rule for almost four 
decades has been the ΔP rule: 


$\Delta P = P(e \mid c) - P(e \mid \bar{c})$ (Eq. 7.1) 


according to which the strength of the rela- 
tion between a binary cause c and effect e 
is determined by their contingency or proba- 
bilistic contrast — the difference between the 
probabilities of e in the presence and ab- 
sence of c (see, e.g., Allan & Jenkins, 1980; 
Jenkins & Ward, 1965). ΔP is estimated by 
relative frequencies. Figure 7.1 displays a 
contingency table where A and B represent 
the frequencies of occurrence of e in the 
presence and absence of c, respectively, and 
C and D represent the frequencies of nonoccurrence 
of e in the presence and absence of 
c, respectively. $P(e \mid c)$ is estimated by $A/(A+C)$ 
and $P(e \mid \bar{c})$ is estimated by $B/(B+D)$. 

If ΔP is positive, then c is believed to produce 
e; if it is negative, then c is believed to 
prevent e; and if ΔP is zero, then c and e are 
not believed to be causally related to each 
other. Several modifications of the ΔP rule 
have been discussed (e.g., Anderson & Sheu, 
1995; Mandel & Lehman, 1998; Perales & 
Shanks, 2003; Schustack & Sternberg, 1981; 
White, 2002). All these modifications pa- 
rameterize the original rule in one way or 
another and thus, by allowing extra degrees 
of freedom, manage to fit certain aspects of 
human judgment data better than the orig- 
inal rule. What is common across all these 
models, however, is that they take covaria- 
tional information contained in the contin- 
gency table as input and transform it into a 
measure of causal strength as output without 
any consideration of the influence of alternative 
causes. Whenever there is confounding 
by an alternative cause (observed or unobserved), 
the ΔP rule fails. 
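
As a worked illustration of Eq. 7.1, the following minimal sketch computes ΔP from the four cell frequencies of a 2 x 2 contingency table. The function name and the example frequencies are hypothetical; the cell labels assume that a and b count occurrences of e with c present and absent, and that c and d count nonoccurrences of e with c present and absent.

```python
def delta_p(a, b, c, d):
    """Return P(e|c) - P(e|not c) from a 2 x 2 contingency table.

    a: cause present, effect present
    b: cause absent,  effect present
    c: cause present, effect absent
    d: cause absent,  effect absent
    """
    if a + c == 0 or b + d == 0:
        raise ValueError("need trials both with and without the cause")
    p_e_given_c = a / (a + c)
    p_e_given_not_c = b / (b + d)
    return p_e_given_c - p_e_given_not_c

# Hypothetical frequency tables.
generative = delta_p(a=16, b=4, c=4, d=16)   # 0.8 - 0.2: c judged to produce e
preventive = delta_p(a=2, b=12, c=18, d=8)   # 0.1 - 0.6: c judged to prevent e
unrelated = delta_p(a=10, b=5, c=10, d=5)    # 0.5 - 0.5: no relation inferred
```

Note that the function sees only the four frequencies. Nothing in its input could signal that an alternative cause covaried with c, which is the limitation described above.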


A Statistical Model for Situations Involving 
Multiple Varying Candidate Causes. Predictive 
learning, of course, is the subject of 
associative learning theory. An appeal of 
this approach is that it is sometimes capa- 
ble of explaining inference involving multi- 
ple causes. The most influential such theory 
(Rescorla & Wagner, 1972, and all its vari- 
ants since) is based on an algorithm of error 
correction driven by a discrepancy between 
the expected and actual outcomes. For each 
learning trial where the cue was presented, 
the model specifies 


$\Delta V_{CS} = \alpha_{CS}\,\beta_{US}\,(\lambda - \Sigma V)$ (Eq. 7.2) 


where ΔV is the change in the strength of 
a given CS-US association on a given trial 
(CS = conditioned stimulus, e.g., a tone; 
US = unconditioned stimulus, e.g., a footshock); 
α and β represent learning rate parameters 
reflecting the saliencies of the CS 
and US, respectively; λ stands for the actual 
outcome of the trial (typically 1 if it 
is present and 0 if it is absent); and ΣV is 
the expected outcome defined as the sum 
of all associative strengths of all CSs present 
on that trial. Each time a cue is followed by 
an outcome, the association between them is 
strengthened (up to the maximum strength 
the US can support, λ); each time the cue is 
presented without the outcome, the association 
weakens (again within certain boundaries, 
−ΣV, to account for preventive cues). 
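
The error-correction step in Eq. 7.2 can be sketched as a single-trial update. This is a minimal illustration, not a full conditioning model: the function name, the parameter values, and the choice to code the actual outcome as 1 or 0 are assumptions made for the example.

```python
def rw_update(weights, present_cues, outcome, alpha=0.3, beta=0.5, lam=1.0):
    """Apply one Rescorla-Wagner trial and return the updated weights.

    weights: dict mapping cue name -> associative strength V
    present_cues: cues shown on this trial (only these are updated)
    outcome: True if the US occurred on this trial, False otherwise
    """
    actual = lam if outcome else 0.0
    expected = sum(weights[cue] for cue in present_cues)  # sum of V over present cues
    error = actual - expected
    updated = dict(weights)
    for cue in present_cues:
        updated[cue] += alpha * beta * error
    return updated

w = {"tone": 0.0}
w = rw_update(w, ["tone"], outcome=True)    # 0 + 0.15 * (1 - 0) = 0.15
w = rw_update(w, ["tone"], outcome=False)   # 0.15 + 0.15 * (0 - 0.15) = 0.1275
```

A reinforced trial moves the weight toward the outcome value and a nonreinforced trial moves it back toward zero, each in proportion to the current prediction error.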

For situations involving only one varying 
cue, its mean weight at equilibrium according 
to the RW algorithm has been shown 
to equal ΔP if the value of β remains the 
same when the US is present and when it 
is absent (for the λ values just mentioned; 
Chapman & Robbins, 1990). In other words, 
this simple and intuitive algorithm elegantly 
explains why causal learning is a function 
of contingency. It also explains a range of 
results for designs involving multiple cues 
such as blocking (see “Blocking: Illustrating 
an Associationist Explanation” section), con- 
ditioned inhibition, overshadowing, and cue 
validity (Miller, Barnet, & Grahame, 1995). 
For some of these designs, the mean weight 
of a cue at equilibrium has been shown to 
equal ΔP conditional on the constant presence 
of other cues that occur in combination 
with that cue (see Cheng, 1997; Danks, 
2003). Danks derived the mean equilibrium 
weights for a larger class of designs. 
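
The equilibrium claim for a single varying cue can be checked numerically. The sketch below makes several assumptions that are not spelled out in the text: a constantly present "context" cue competes with the target cue, a single learning rate stands in for the product of α and β (so β is the same whether or not the US occurs), and the trial frequencies are hypothetical. It estimates the mean late-training weight of the target cue for comparison with ΔP.

```python
import random

def rw_equilibrium_weight(a, b, c, d, rate=0.01, epochs=400, seed=0):
    """Mean late-training Rescorla-Wagner weight of the target cue.

    a: cue & outcome, b: no cue & outcome,
    c: cue & no outcome, d: neither.
    A context cue is present on every trial (an assumption of this sketch).
    """
    trials = ([(True, True)] * a + [(False, True)] * b +
              [(True, False)] * c + [(False, False)] * d)
    rng = random.Random(seed)
    v_cue, v_context = 0.0, 0.0
    samples = []
    for epoch in range(epochs):
        rng.shuffle(trials)
        for cue_present, outcome in trials:
            expected = v_context + (v_cue if cue_present else 0.0)
            error = (1.0 if outcome else 0.0) - expected
            v_context += rate * error       # context is always present
            if cue_present:
                v_cue += rate * error       # target updated only when present
            if epoch >= epochs // 2:        # average over the second half only
                samples.append(v_cue)
    return sum(samples) / len(samples)

estimate = rw_equilibrium_weight(a=16, b=4, c=4, d=16)
delta_p = 16 / (16 + 4) - 4 / (4 + 16)      # P(e|c) - P(e|not c)
```

Under these assumptions the averaged weight should settle close to the ΔP value computed from the same frequencies, while the context cue absorbs the base rate of the outcome.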


BLOCKING: ILLUSTRATING AN ASSOCIATIONIST 
EXPLANATION 

Beyond the cornerstones, the parallels be- 
tween conditioning and human causal learn- 
ing are manifested across numerous experi- 
mental designs often called paradigms in the 
literature. One parallel involves the block- 
ing paradigm. Using a Pavlovian condition- 
ing paradigm, Kamin (1969) established cue 
B as a perfect predictor for an outcome (B+, 
with “+” representing the occurrence of the 
outcome). In a subsequent phase, animals 
were presented with a compound consist- 
ing of B and a new, redundant cue A. The 
AB compound was also always followed by 
the outcome (AB+), yet A received little 
conditioning; its conditioning was blocked by 
maximum associative strength supported by 
the stimulus. Because the association be- 
tween B and the outcome is already at 
asymptote when A is introduced, there is no 
error left for A to explain. In other words, 
the outcome is already perfectly predicted 
by B, and nothing is left to be predicted by 
A, which accounts for the lack of condition- 
ing to cue A. Shanks (1985b) replicated the 
same finding in a causal reasoning experi- 
ment with human participants, although the 
human responses seem to reflect uncertainty 
of the causal status of A rather than cer- 
tainty that it is noncausal (e.g., Waldmann & 
Holyoak, 1992). 
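
The associative account of blocking can be reproduced in a few lines. This is a minimal sketch with hypothetical parameter values (a single learning rate of 0.2, an asymptote of 1, and 50 trials per phase); it compares a blocking group (B+ then AB+) with a control group that receives only AB+ trials.

```python
def train(weights, cues, outcome, trials, rate=0.2, lam=1.0):
    """Run identical Rescorla-Wagner trials, updating weights in place."""
    for _ in range(trials):
        error = (lam if outcome else 0.0) - sum(weights[c] for c in cues)
        for c in cues:
            weights[c] += rate * error

# Blocking group: B+ first, then the AB+ compound.
blocked = {"A": 0.0, "B": 0.0}
train(blocked, ["B"], outcome=True, trials=50)
train(blocked, ["A", "B"], outcome=True, trials=50)

# Control group: AB+ compound only.
control = {"A": 0.0, "B": 0.0}
train(control, ["A", "B"], outcome=True, trials=50)
```

In the blocking group B is already near asymptote when A appears, so the prediction error on AB+ trials is close to zero and A gains almost nothing. In the control group A and B share the available strength equally.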


FAILURE OF THE RW ALGORITHM TO TRACK 
COVARIATION WHEN A CUE IS ABSENT 

The list of similarities between animal condi- 
tioning and human causal reasoning seemed 
to grow, prompting the interpretation that 
causal learning is nothing more than asso- 
ciative learning. However, Shanks’ (1985b) 
results also revealed evidence for back- 
ward blocking; in fact, there is evidence for 
backward blocking even in young children 
(Gopnik et al., 2004). In this procedure, the 
order of learning phases is simply reversed; 
participants first learn about the perfect re- 
lation between AB and the outcome (AB+) 
and subsequently learn that B by itself is also 
a perfect predictor (B+). Conceptually, for- 
ward and backward blocking are identical — 
at least from a causal perspective. A causal 
explanation might go: If one knows that A 
and B together always produce an effect, and 
one also knows that B by itself also always 
produces the effect, one can infer that B is
a strong cause. A, however, could be a cause, 
even a strong one, or noncausal; its causal sta- 
tus is unclear. Typically, participants express 
such uncertainty with low to medium rat- 
ings relative to ratings from control cues that 
have been paired with the effect an equal 
number of times (see Cheng, 1997, for a 
review). 

Beyond increasing susceptibility to atten- 
tion and memory biases (primacy and re- 
cency, cf. for example, Dennis & Ahn, 2001), 
there is no reason why the temporal order
in which knowledge is acquired should play a role under a causal
learning perspective. This is not so under 
an associative learning perspective, however. 
The standard assumption here is that the 
strength of a cue can only be updated when 
that cue is present. In the backward block- 
ing paradigm, however, participants retro- 
spectively alter their estimate of A on the B+ 
trials in phase 2. In other words, the ΔP of A,
conditional on the presence of B, decreases 
over a course of trials in which A is actually 
absent, and the algorithm fails to track the 
covariation for A. 

Several modifications of RW have been 
proposed to allow the strengths of absent 
cues to be changed, for instance, by setting 
the learning parameter α negative on trials
where the cue is absent (see Dickinson & 
Burke, 1996; Van Hamme & Wasserman, 
1994). Such modifications can explain back- 
ward blocking and some other findings 
showing retrospective revaluation (see, e.g.,
Larkin, Aitken, & Dickinson, 1998; for an 
extensive review of modifications to asso- 
ciative learning models applicable to human 
learning, see De Houwer & Beckers, 2002). 
However, they also oddly predict that one 
will have difficulty learning that there are 
multiple sufficient causes of an effect. For 
example, if one sometimes drinks both tea 
and lemonade, then learning that tea alone 
can quench thirst will cause one to unlearn 
that lemonade can quench thirst. They also 
fail when two steps of retrospective revalua- 
tion are required. Macho and Burkart (2002) 
demonstrated that humans are capable of it- 
erative retrospective revaluation, a backward 
process whereby the causal strength of a 
target cause is disambiguated by evaluating 
another cause, which in turn is evaluated 
by drawing on information about a third 
cause (see also Lovibond, Been, Mitchell, 
Bouton, & Frohardt, 2003, for further evi- 
dence that blocking in human causal reason- 
ing is inferential, and De Houwer, 2002, for 
a demonstration that even forward blocking 
recruits retrospective inferences). In these 
cases, ΔP with other cues controlled coin-
cides with causal intuitions, but associative
models fail to track conditional ΔP.
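The contrast between the standard rule and the modified rule can be made concrete with a small simulation. The sketch below is written in the spirit of the Van Hamme and Wasserman (1994) proposal: cues that have been encountered before but are absent on the current trial are updated with a separate learning rate, which is zero in the standard model and negative in the modified one. The specific rates, trial counts, and function name are illustrative assumptions.

```python
def modified_rw(trials, alpha_present=0.3, alpha_absent=0.0, lam=1.0):
    """Rescorla-Wagner with an optional learning rate for absent cues.

    trials: list of (set_of_present_cues, outcome_occurred) pairs.
    alpha_absent = 0 gives the standard model; a negative value lets
    previously encountered but absent cues change in the opposite direction.
    """
    strengths = {}
    for cues, outcome in trials:
        for cue in cues:
            strengths.setdefault(cue, 0.0)
        # The prediction error is computed from the cues present on this trial.
        error = (lam if outcome else 0.0) - sum(strengths[c] for c in cues)
        for cue in strengths:
            rate = alpha_present if cue in cues else alpha_absent
            strengths[cue] += rate * error
    return strengths


# Backward blocking: AB+ trials first, then B+ trials (hypothetical counts).
backward = [({"A", "B"}, True)] * 50 + [({"B"}, True)] * 50

standard = modified_rw(backward, alpha_absent=0.0)
revised = modified_rw(backward, alpha_absent=-0.1)

print("A, standard RW:", round(standard["A"], 3))
print("A, modified RW:", round(revised["A"], 3))
```

With the absent-cue rate at zero, A keeps the strength it acquired in phase 1, so the standard model shows no backward blocking. With a negative rate, A loses strength on every B+ trial. The same mechanism is what would make a learner "unlearn" lemonade after tea-alone trials in the example above.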
Covariation Tracking


PREDICTIONS BY THE STATISTICAL VIEW 
AND THE CAUSAL MECHANISM VIEW ON
SOME INTUITIVE EXAMPLES

Even successfully tracked covariation, how- 
ever, does not equal causation, as we il- 
lustrated earlier and as every introductory 
statistics text warns. None of these cases can 
be explained by the ΔP rule in Eq. (7.1). For
example, even if the ΔP for rooster crow-
ing is 1, nobody would claim that the crow- 
ing caused the sun to rise. Although the 
candidate cause, crowing, covaries perfectly 
with the effect, sunrise, there is an alterna- 
tive cause that covaries with the candidate: 
Whenever the rooster crows, the Earth’s ro- 
tation is just about to bring the farm toward 
the sun. Our intuition would say that be- 
cause there is confounding, one cannot draw 
any causal conclusion. This pattern of in- 
formation fits the overshadowing design. If 
crowing is the more salient of the two con- 
founded cues, then RW would predict that 
crowing causes sunrise. 

Let us digress for a moment to consider 
what the causal mechanism view predicts. 
Power theorists might argue that the ab- 
sence of a plausible mechanism whereby a 
bird could influence the motion of stellar 
objects, rather than anything that has to do 
with covariation, is what prevents us from 
erroneously inducing a causal relation. In 
this example, in addition to the confound- 
ing by the Earth’s rotation, there happens 
to be prior causal knowledge, specifically, of 
the noncausality of a bird’s crowing with re- 
spect to sunrise. Tracing the possible origin of 
that knowledge, however, we see that we do 
have covariational information that allows 
us to arrive at the conclusion that the re- 
lation is noncausal. If we view crowing and 
sunrise at a more general level of abstrac- 
tion, namely, as sound and the movement 
of large objects, we no longer have the con- 
founding we noted at the specific level of 
crowing and sunrise. We have observed that 
sounds, when manipulated at will so alter- 
native causes do occur independently of the 
candidate, thus allowing causal inference, do 
not move large objects. Consequently, crowing
belongs to a category that does not cause
sunrise (and does not belong to any category
that does cause sunrise), and the confounded 
covariation between crowing and sunrise is 
disregarded as spurious. 

Our consideration shows that, contrary 
to the causal mechanism view, prior knowl- 
edge of noncausality neither precludes nor 
refutes observation-based causal discovery. 
Thagard (2000) gave a striking historic illus- 
tration of this fact. Even though the stomach 
had been regarded as too acidic an environ- 
ment for bacteria to survive, a bacterium was in-
ferred to be a cause of stomach ulcer. Prior
causal knowledge may render a novel can- 
didate causal relation more or less plausible 
but cannot rule it out definitively. Moreover, 
prior causal knowledge is often stochastic. 
Consider a situation in which one observes 
that insomnia results whenever one drinks
champagne. Now, there may be a straightfor- 
ward physiological causal mechanism link- 
ing cause and effect, but it is also plausible 
that the relation is not causal; it could eas- 
ily be that drinking and insomnia are both 
caused by a third variable — for example, at- 
tending parties (cf. Gopnik et al., 2004). 

Returning to the pitfall of statistical and 
associative models, besides the confounding 
problem, we find that there is the overde- 
termination problem, where two or more 
causes covary with an effect, and each cause 
by itself would be sufficient to produce 
the effect. The best-known illustration of 
overdetermination is provided by Mackie 
(1974): Imagine two criminals who both 
want to murder a third person who is about 
to cross a desert; unaware of each other’s 
intentions, one criminal puts poison in the 
victim’s water bottle, while the other punc- 
tures the bottle. Each action on its own co- 
varies perfectly with the effect, death, and 
would have been sufficient to bring the ef- 
fect about. However, in the presence of the 
alternative cause of death (a given fact in 
this example), so that there is no confound- 
ing, varying each candidate cause in this case 
makes no difference; for instance, the ΔP for
poison with respect to death, conditional on
the presence of the puncturing of the wa-
ter canteen, is 0! So, Mackie’s puzzle goes,
who is the murderer? Presumably, a lawyer could
defend each criminal by arguing that their 
respective deed made no difference to the 
victim's ultimate fate — he would have died 
anyway as a result of the other action (but 
see Katz, 1989; also see Ellsworth, Chap. 
28; Pearl, 2000; and Wright, 1985, on ac- 
tual causation). Mackie turned to the actual 
manner of death (by poison or by dehydra- 
tion) for a solution. But, suppose the death 
is discovered too late to yield useful autopsy 
information. Would the desert traveler then 
have died without a cause? Surely our in- 
tuition says no: The lack of covariation in 
this case does not imply the lack of causation 
(see Ellsworth, Chap. 28; Spellman & Kin- 
cannon, 2001, for studies on intuitive judg- 
ments in situations involving multiple suffi-
cient causes). What matters is the prediction
of the consequences of actions, such as poi- 
soning, which may or may not be revealed 
in the covariation observed in a particular 
context. 


Empirical Findings on Humans and Rats 


The observed distinction between covaria- 
tion and causation in the causal learning liter- 
ature corroborates intuitive judgment in the 
rooster and desert traveler examples. It is no 
wonder that, Pearson’s condemnation of the
concept of causality notwithstanding, con- 
temporary artificial intelligence has whole- 
heartedly embraced causality (see, for exam- 
ple, Pearl, 2000). We now review how hu- 
man causal reasoning capacities exceed the 
mere tracking of stimulus-outcome associa-
tions. 


THE DIRECTION OF CAUSALITY 


As mentioned earlier, correlations and asso- 
ciations are bidirectional (for implications of 
the bidirectional nature of associations on 
conditioning, see, e.g., Miller & Barnet, 1993; 
and Savastano & Miller, 1998) and thus can- 
not represent directed causal information. 
However, the concept of causality is funda- 
mentally directional (Reichenbach, 1956) in 
that causes produce effects, but effects can- 
not produce causes. This directionality con- 


of an effect. A straightforward demonstra- 
tion that humans are sensitive to the direc- 
tion of the causal arrow was provided by 
Waldmann and Holyoak (1992). 

A corollary of the directional nature of 
the causal arrow, Waldmann and Holyoak 
(1992) reasoned, is that only causes, but 
not effects, should “compete” for explana- 
tory power. Let us first revisit the blocking 
paradigm with a causal interpretation. If B 
is a perfect cause of an outcome O, and A 
is only presented in conjunction with B, one 
has no basis of knowing to what extent, if 
at all, A actually produces O. Consequently, 
the predictiveness of A should be depressed 
relative to B in a predictive situation. How- 
ever, if B is a consistent effect of O, there is no 
reason why A cannot also be an equally con- 
sistent effect of O. Alternative causes need 
to be kept constant to allow causal inference, 
but alternative effects do not. Consequently, 
the predictiveness of A should not be de- 
pressed in a diagnostic situation. 

This asymmetric prediction was tested us- 
ing scenarios to manipulate whether a vari- 
able is interpreted as a candidate cause or 
an effect without changing the associations 
between variables. For example, participants 
had to learn the relation between several 
light buttons and the state of an alarm sys- 
tem. The instructions introduced the but- 
tons as causes for the alarm in the predictive 
condition but as potential consequences of 
the state of the alarm system in the diagnos- 
tic condition. 

As predicted, there was blocking in the
predictive condition, but not in the diagnos- 
tic condition. These results reveal that hu- 
mans are sensitive to, and make use of, the 
direction of the causal arrow. 

Associationists in fact have no reason 
for objecting to using temporal information. 
Unlike causal relations, temporal ordering is 
observable. To address the problem raised by 
Waldmann and Holyoak (1992), association- 
ist models can specify that, when applied 
to explain causal learning, candidate causes 
can precede their effects, but not vice versa, 
and that the temporal ordering that counts is 
that of the actual occurrence of events rather 
than that of their presentation to the reasoner. Previous associationist models, how-
ever, have not made a distinction between 
occurrence and presentation order. There- 
fore, by default, they treat the buttons, for 
which information was presented first, as 
cues and the alarm, for which information 
was presented second, as an outcome, and 
hence, predict equal amounts of cue com- 
petition in both scenarios. 

Instead of amending associationist mod- 
els to treat the order of actual occurrence 
as critical, which would be natural un- 
der a computational approach, researchers 
criticized Waldmann and Holyoak’s (1992) 
findings on technical grounds (Matute, 
Arcediano, & Miller, 1996; Shanks & Lopez, 
1996). Follow-up work from Waldmann’s 
lab (Waldmann, 2000, 2001; Waldmann & 
Holyoak, 1997), however, has demonstrated 
that the asymmetry in cue competition is in- 
deed a robust finding (Waldmann, 2001). 


CEILING EFFECTS AND PEOPLE'S SENSITIVITY
TO PROPER EXPERIMENTAL DESIGN

A revealing case of the distinction between 
covariation and causation has to do with 
what is known in experimental design as 
a ceiling effect. This case does not involve 
any confounding. We illustrate it with the 
preventive version of the effect, which is 
never covered in courses on experimental 
design — the underlying intuition is so pow- 
erful it needs no instructional augmenta- 
tion. Imagine that a scientist conducts an 
experiment to find out whether a new drug 
cures migraine. She follows the usual proce- 
dure and administers the drug to an exper- 
imental group of patients, while an equiv- 
alent control group receives a placebo. At 
the end of the study, the scientist discov- 
ers that none of the patients in the ex- 
perimental group, but also none of the pa- 
tients in the control group, suffered from 
migraine. If we enter this information into 
the ΔP rule, we see that $P(e \mid c) = 0$ and
$P(e \mid \bar{c}) = 0$, yielding ΔP = 0. According to
the ΔP rule and RW, this would indicate
that there is no causal relation; that is, the 
drug does not cure migraine. Would the scientist
really conclude that? No, she
would instead recognize that she has con-
ducted a poor experiment. For some reason, 
her sample suffered from a preventive ver- 
sion of the ceiling effect — the effect never 
occurred, regardless of the manipulation. If 
the effect never occurs in the first place, how 
can a preventive intervention be expected to 
prove its effectiveness? 

Even rats seem to appreciate this argu- 
ment. When an inhibitory cue, that is, one 
with negative associative strength, is repeat- 
edly presented without the outcome so that 
the actual outcome is 0 whereas the ex-
pected outcome is negative, associative mod-
els would predict that the cue reduces its
strength toward 0. That is, in a noncausal
world, we would unlearn our preventive 
causes whenever they are not accompanied 
by a generative cause. For example, when we 
inoculate child after child with polio vac- 
cine in a country and there is no occur- 
rence of polio in that country, we would 
come to believe that the polio vaccine does 
not function anymore (rather than merely 
that it is not needed). To the contrary, even 
for rats, the inhibitory cue retains its nega- 
tive strength (Zimmerhart-Hart & Rescorla, 
1974). In other words, when an outcome in 
question never occurred, both when a condi- 
tioned inhibitory cue was present and when 
it was not, the rats apparently treated the 
zero ΔP value as uninformative and retained
the inhibitory status of the cue. In this case, 
in spite of a discrepancy between the ex- 
pected and actual outcomes, there is no re- 
vision of causal strength. We are not aware 
of any modification of associative algorithms 
that can accommodate this finding.
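The associative prediction that the rats violated can be reproduced in a few lines. The sketch below assumes the standard Rescorla-Wagner error term with a single inhibitory cue presented alone and never followed by the outcome. The starting strength, learning rate, and number of trials are hypothetical values chosen only for illustration.

```python
def present_inhibitor_alone(initial_strength=-0.8, n_trials=30, alpha=0.3):
    """Track an inhibitory cue's strength over outcome-free presentations.

    On each trial the actual outcome is 0 and the expected outcome is the
    cue's current (negative) strength, so the error 0 - strength is positive
    and pushes the strength up toward 0.
    """
    strength = initial_strength
    history = [strength]
    for _ in range(n_trials):
        error = 0.0 - strength
        strength += alpha * error
        history.append(strength)
    return history


history = present_inhibitor_alone()
print("start:", history[0], "end:", round(history[-1], 5))
```

Under the error-correction rule the inhibitor drifts steadily toward zero, which is the "unlearning" of a preventive cause that Zimmerhart-Hart and Rescorla (1974) did not observe.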

Notice that in the hypothetical migraine 
experiment, one can in fact conclude that 
the drug does not cause migraine. Thus, given 
the exact same covariation, one’s conclu- 
sion differs depending on the direction of 
influence under evaluation (generative vs. 
preventive). Wu and Cheng (1999) con- 
ducted an experiment that showed that 
beginning college students, just like experi- 
enced scientists, refrain from making causal 
inferences in the generative and preventive 
ceiling effects situations. People’s preference 
to refrain from causal inference in such situations is at odds with purely covariational
or associative accounts. What must the pro- 
cess of human causal induction involve so 
it will reflect people’s unwillingness to en- 
gage in causal inference in such situations? 
More generally, what must this process in- 
volve so it will distinguish causation from 
mere covariation? 


A Causal Network Approach 


A solution to the puzzle posed by the 
distinction between covariation and causa- 
tion is to test hypotheses involving causal 
structures (Cheng, 1997; Novick & Cheng, 
2004; Pearl, 1988, 2000; Spirtes, Glymour, & 
Scheines, 1993/2000). Pearl (2000) and 
Spirtes et al. (1993/2000) developed a
formal framework for causal inference based on
causal Bayesian networks. In this framework, 
causal structures are represented as directed 
acyclic graphs, graphs with nodes connected 
by arrows. The nodes represent variables, 
and each arrow represents a direct causal re- 
lation between two variables. “Acyclic” refers 
to the constraint that the chains formed by 
the arrows are never loops. The graphs are 
assumed to satisfy the Markov condition, 
which states that for any variable X in the 
graph, for any set S of variables in the graph 
not containing any direct or indirect effects 
of X, X is jointly independent of the vari- 
ables in S conditional on any set of values 
of the set of variables that are direct causes 
of X (see Pearl, 1988, 2000; Spirtes et al., 
1993/2000). An effect of X is a variable that 
has (1) an arrow directly from X pointing 
into it or (2) a pathway of arrows originating 
from X pointing into it. Gopnik et al. (2004) 
proposed that people are able to assess pat- 
terns of conditional independence using the 
Markov assumption and infer entire causal 
networks from the patterns. Cheng (1997) 
proposed instead that people (and perhaps 
other species) evaluate one causal relation 
in a network at a time while taking into 
consideration other relations in the network. 
Clear-cut evidence discriminating between
these two variants is still unavailable. 
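The Markov condition can be illustrated with the earlier champagne example, treated as a common-cause network in which attending a party influences both drinking champagne and insomnia, with no arrow between the latter two. The sketch below enumerates the joint distribution implied by that graph and compares the contrast for insomnia given champagne with and without conditioning on the common cause. All probabilities are made-up values for illustration.

```python
from itertools import product

# Hypothetical parameters for the graph: party -> champagne, party -> insomnia.
P_PARTY = 0.3
P_CHAMPAGNE_GIVEN_PARTY = {True: 0.8, False: 0.1}
P_INSOMNIA_GIVEN_PARTY = {True: 0.7, False: 0.2}


def joint(party, champagne, insomnia):
    """Joint probability factorized according to the common-cause graph."""
    p = P_PARTY if party else 1 - P_PARTY
    p_champagne = P_CHAMPAGNE_GIVEN_PARTY[party]
    p_insomnia = P_INSOMNIA_GIVEN_PARTY[party]
    p *= p_champagne if champagne else 1 - p_champagne
    p *= p_insomnia if insomnia else 1 - p_insomnia
    return p


def prob(event, given=lambda w: True):
    """P(event | given), by enumerating worlds w = (party, champagne, insomnia)."""
    worlds = list(product([True, False], repeat=3))
    numerator = sum(joint(*w) for w in worlds if given(w) and event(w))
    denominator = sum(joint(*w) for w in worlds if given(w))
    return numerator / denominator


def insomnia(w):
    return w[2]


# Unconditional contrast: insomnia covaries with champagne.
contrast_marginal = (prob(insomnia, lambda w: w[1])
                     - prob(insomnia, lambda w: not w[1]))

# Contrast with the common cause held constant (party attended).
contrast_given_party = (prob(insomnia, lambda w: w[0] and w[1])
                        - prob(insomnia, lambda w: w[0] and not w[1]))

print(round(contrast_marginal, 3), round(contrast_given_party, 3))
```

In this toy network champagne and insomnia covary overall, yet the contrast vanishes once the direct cause of insomnia is held fixed. That is the pattern of conditional independence the Markov condition requires for a variable that is not an effect of insomnia.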


OF CAUSAL INDUCTION 


Cheng’s (1997) power PC theory (short for a
causal power theory of the probabilistic con- 
trast model) starts with the Humean con- 
straint that causality can only be inferred 
using observable evidence (in the form of 
covariations and temporal and spatial infor- 
mation) as input to the reasoning process. 
She combines that constraint with Kant’s 
(1781/1965) postulate that reasoners have an 
a priori notion that types of causal relations 
exist in the universe. This unification can 
best be illustrated with an analogy. Accord- 
ing to Cheng, the relation between a causal 
relation and a covariation is like the relation 
between a scientific theory and a model. Sci- 
entists postulate theories (involving unob- 
servable entities) to explain models (i.e., ob- 
served regularities or laws); the kinetic the- 
ory of gases, for example, is used to explain 
Boyle’s law. Boyle’s law describes an observ- 
able phenomenon, namely that pressure ×
volume = constant (under certain boundary 
conditions), and the kinetic theory of gases 
explains in terms of unobservable entities 
why Boyle’s law holds (gases consist of small 
particles moving at a speed that increases with
their temperature, and pressure is generated
by the particles colliding with the walls of 
the container). Likewise, a causal relation is 
the unobservable entity that reasoners hope 
to infer in order to explain observable regu- 
larities between events (Cheng, 1997). 

This distinction between a causal relation 
as a distal, postulated entity and covariation 
as an observable, proximal stimulus implies 
that there can be situations in which there is 
observable covariation but causal inference 
is not licensed. Computationally, this means 
that causality is represented as an unbound 
variable (cf. Doumas & Hummel, Chap. 4;
Holyoak & Hummel, 2000) represented sep- 
arately and not bound to covariation, allow- 
ing situations in which covariation has a def- 
inite value (e.g., 0, as in the ceiling effect) 
but causal power has no value. Traditional 
models (Allan & Jenkins, 1980; Anderson & 
Sheu, 1995; Jenkins & Ward, 1965; Man- 
del & Lehman, 1998; Schustack & Sternberg, 
1981; White, 2002; and Rescorla & Wagner, 
1972) do not represent causality as a separate vari-
able. Hence, whenever there is observed 
covariation, they will always compute a defi- 
nite causal strength. In an analogy to percep- 
tion, one could say that such models never 
go beyond describing features of the proxi- 
mal stimulus (observable evidence — covari- 
ation or image on the retina) and fail to infer 
features of the distal stimulus (causal power 
that produced the covariation or object in 
the 3D world that produced retinal images). 

How then does the power PC theory 
(Cheng, 1997) go beyond the proximal stim- 
ulus and explain the various ways in which 
covariation does not imply causation? The 
first step in the solution is the inclusion 
of unobservable entities, including the de- 
sired unknown, the distal causal relation, in 
the equations. The theory partitions all (ob- 
served and unobserved) causes of effect e 
into the candidate cause in question, c, and 
a, a composite of all alternative causes of 
e. The unobservable probability with which 
c produces e (in other words, the probabil- 
ity that e occurs as a result of c’s occurring) 
is termed the generative power of c, repre- 
sented by $q_c$ here. When ΔP > 0, $q_c$ is the
desired unknown. Likewise, when ΔP < 0,
the preventive power of c is the desired un-
known. Two other relevant theoretical un-
knowns are $q_a$, the probability with which
a produces e when it occurs, and P(a), the 
probability with which a occurs. The com- 
posite a may include unknown and there- 
fore unobservable causes. Because any causal 
power may have a value of 0, or even
no value at all, these variables are merely 
hypotheses — they do not presuppose that 
c and a indeed have causal influence on e. 
The idea of a cause producing an effect and 
the idea of a cause preventing an effect are 
primitives in the theory. 

On the assumption that c and a influ- 
ence e independently, the power PC the- 
ory explains the two conditional probabil- 
ities defining ΔP as follows:

$$P(e \mid c) = q_c + P(a \mid c)\, q_a - q_c\, P(a \mid c)\, q_a \tag{7.3}$$

$$P(e \mid \bar{c}) = P(a \mid \bar{c})\, q_a \tag{7.4}$$


Equation (7.3) “explains” that given that c
has occurred, e is produced by c or by the
composite a, nonexclusively (e is jointly pro-
duced by both with a probability that fol-
lows from the independent influence of c and
a on e). Equation (7.4) “explains” that given
that c did not occur, e is produced by a alone. 
It follows from Eqs. (7.3) and (7.4) that 


$$\Delta P = q_c + P(a \mid c)\, q_a - q_c\, P(a \mid c)\, q_a - P(a \mid \bar{c})\, q_a \tag{7.5}$$


From Eq. (7.5), it can be seen that unless 
c and a occur independently, there are four 
unknowns: $q_c$, $q_a$, $P(a \mid c)$, and $P(a \mid \bar{c})$; it
follows that, in general, despite ΔP’s having a
definite value, there is no unique solution for
$q_c$. This failure corresponds to our intuition
that covariation need not imply causation — 
an intuition that purely covariational models 
are incapable of explaining. 

In the special case in which a occurs inde- 
pendently of c (e.g., when alternative causes 
are held constant), Eq. (7.5) simplifies to 


Eq. (7.6),

$$q_c = \frac{\Delta P}{1 - P(e \mid \bar{c})} \tag{7.6}$$


in which all variables besides $q_c$ are observ-
able. In this case, $q_c$ can be solved. Being
able to solve for $q_c$ only under the condi-
tion of independent occurrence explains why 
manipulation by free will encourages causal 
inference (the principle of control in ex- 
perimental design and everyday reasoning). 
When one manipulates a variable, that deci- 
sion by free will is likely to occur indepen- 
dently of alternative causes of that variable. 
At the same time, the condition of indepen- 
dent occurrence explains why causal infer- 
ences resulting from interventions are not 
always correct. Alternative causes are un- 
likely to covary with one’s decision to ma- 
nipulate, but sometimes they may, as the 
food allergy example illustrates. Note that 
the principle of “no confounding” is a result 
in this theory, rather than an unexplained 
axiomatic assumption, as it is in current 
scientific methodology (also see Dunbar & 
Fugelsang, Chap. 29). 
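The step from Eq. (7.5) to Eq. (7.6) can be spelled out as follows. The derivation is a sketch that writes $P(a)$ for the common value of $P(a \mid c)$ and $P(a \mid \bar{c})$ when a occurs independently of c; the shorthand $P(a)$ for this common value is introduced here only for the derivation.

```latex
\begin{align*}
\Delta P &= q_c + P(a)\,q_a - q_c\,P(a)\,q_a - P(a)\,q_a
  && \text{(Eq. 7.5 with } P(a \mid c) = P(a \mid \bar{c}) = P(a)\text{)} \\
         &= q_c\,\bigl[1 - P(a)\,q_a\bigr] \\
         &= q_c\,\bigl[1 - P(e \mid \bar{c})\bigr]
  && \text{(by Eq. 7.4)} \\
\Rightarrow\quad q_c &= \frac{\Delta P}{1 - P(e \mid \bar{c})},
  \qquad \text{provided } P(e \mid \bar{c}) < 1 .
\end{align*}
```

The two terms involving a alone cancel only because a is equally likely with and without c, which is why the solution depends on the independence condition.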
Now it is also obvious how the power PC
theory can explain why the ceiling effects 
block causal inference (even when there is 
no confounding) and do so under differ- 
ent conditions. In the generative case, e al- 
ways occurs, regardless of the manipulation; 
hence, $P(e \mid c) = P(e \mid \bar{c}) = 1$, leaving $q_c$ in
Eq. (7.6) with an undefined value. In con-
trast, in the preventive case, e never occurs,
again regardless of the manipulation; there-
fore, $P(e \mid c) = P(e \mid \bar{c}) = 0$, leaving $p_c$ in
Eq. (7.7) with an undefined value.
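A minimal sketch of the two power estimates shows how the ceiling cases fall out of the equations. Eq. (7.6) is implemented as given above. Eq. (7.7) is assumed here to have the form p_c = −ΔP / P(e | not c), as in Cheng (1997). The function names and the use of None to stand for "no value" are illustrative choices.

```python
from fractions import Fraction
from typing import Optional


def generative_power(p_e_given_c, p_e_given_not_c) -> Optional[Fraction]:
    """Eq. (7.6): q_c = dP / (1 - P(e | not c)).

    Returns None when e always occurs without c (generative ceiling),
    because the denominator is 0 and q_c has no value.
    """
    p_c, p_not_c = Fraction(p_e_given_c), Fraction(p_e_given_not_c)
    if p_not_c == 1:
        return None
    return (p_c - p_not_c) / (1 - p_not_c)


def preventive_power(p_e_given_c, p_e_given_not_c) -> Optional[Fraction]:
    """Assumed form of Eq. (7.7): p_c = -dP / P(e | not c).

    Returns None when e never occurs without c (preventive ceiling).
    """
    p_c, p_not_c = Fraction(p_e_given_c), Fraction(p_e_given_not_c)
    if p_not_c == 0:
        return None
    return -(p_c - p_not_c) / p_not_c


# Migraine experiment: no migraine in either group, so dP = 0.
migraine_as_prevention = preventive_power(0, 0)  # no value: uninformative
migraine_as_generation = generative_power(0, 0)  # 0: drug does not cause migraine

print(migraine_as_prevention, migraine_as_generation)
```

The same zero contrast thus yields no estimate when the drug is evaluated as a preventer and an estimate of zero when it is evaluated as a generator, matching the asymmetry in the migraine example.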

Although the theory distinguishes be- 
tween generative and preventive causal 
powers, this distinction does not constitute 
a free parameter. Which of the two equa- 
tions applies readily follows from the value 
of ΔP. On occasions where ΔP = 0, both
equations apply and make the same predic-
tion, namely, that causal power should be 0
except in ceiling effect situations. Here, the 
reasoner has to make a pragmatic decision 
on whether he or she is evaluating the ev- 
idence to assess a preventive or generative 
relation, and whether the evidence at hand 
is meaningful or not for that purpose. 

Most causes are complex, involving not 
just a single factor but a conjunction of fac- 
tors operating in concert. In other words, 
the assumption made by the power PC the- 
ory that c and a influence e independently 
is false most of the time. When this as- 
sumption is violated, if an alternative cause 
(part of a) is observable, the independent 
influence assumption can be given up for 
that cause, and progressively more complex 
causes can be evaluated using the same dis- 
tal approach that represents causal powers. 
This approach has been extended to eval- 
uate conjunctive causes involving two fac- 
tors (see Novick & Cheng, 2004). Even if 
alternative causes are not observable, how- 
ever, Cheng (2000) showed that as long as 
they occur with about the same probability 
in the learning context as in the generaliza- 
tion context, predictions according to simple
causal power involving a single factor will
hold. That is, under that condition, it does
not matter what the reasoner assumes about
the independent influence of c and a on e.


EXPERIMENTAL TESTS OF A COMPUTATIONAL 
CAUSAL POWER APPROACH 

The predictions made by the power PC the- 
ory and by noncausal accounts differ in di- 
verse ways. We review three of these dif- 
ferences in this section. The first concerns 
a case in which covariation does not equal 
causation. The second concerns a qualitative 
pattern of the influence of $P(e \mid \bar{c})$, the base
rate of e, for candidate causes with the same
ΔP. The third concerns the flexible and co-
herent use of causal power to make causal 
predictions. 


More Studies on Covariation and Causa-
tion. We have already mentioned Wu and
Cheng’s (1999) study on ceiling situations,
showing that people distinguish covariation
from causation. Lovibond et al. (2003) re-
ported a further test of this distinction. Their 
experiments are not a direct test of the 
power PC theory because they do not in- 
volve binary variables only. They do, how- 
ever, test the same fundamental idea under- 
lying a distal approach. That is, to account 
for the distinction between covariation and 
causation, there must be an explicit repre- 
sentation of unobservable causal relations. 

Lovibond et al. (2003) tested human sub- 
jects on “backward blocking” and on “re- 
lease from overshadowing,” when the out- 
come (an allergic reaction to some food) 
occurred at what the subjects perceived as 
the “ceiling” level for one condition and at 
an intermediate level for another condition. 
The release-from-overshadowing condition 
involved a retrospective design, and differed 
from the backward blocking condition only 
in that, when the blocking cue B (the cue 
that did appear by itself) appeared, the out- 
come did not occur. Thus, considering the 
effect of cue A, the cue that never appeared 
by itself, with cue B held constantly present, 
one sees that introducing A made a differ- 
ence to the occurrence of the outcome. This 
nonzero ΔP implies causality, regardless of
[...] compound) at a ceiling or nonceiling level.

The critical manipulation was a “pretrain- 
ing compound” phase during which one 
group of subjects, the ceiling group, saw 
that a combination of two allergens pro- 
duced an outcome at the same level (“an 
allergic reaction”) as a single allergen (i.e., 
the ceiling level). In contrast, the nonceil- 
ing group saw that a combination of two 
allergens produced a stronger reaction (“a 
STRONG allergic reaction”) than a single 
allergen (“an allergic reaction”). Following 
this pretraining phase, all subjects were pre- 
sented with information regarding various 
cues and outcomes according to their assign- 
ment to the backward-blocking or release- 
from-overshadowing groups. Critically, the 
outcome in this main training phase
occurred only at the intermediate level (“an
allergic reaction”) for both the ceiling and 
nonceiling groups. Ingeniously, as a result of 
pretraining, subjects’ perception of the level 
of the outcome in the main phase would be 
expected to differ. The outcome in this phase
took only one form, “an allergic reaction”; the ceiling
group would perceive it to occur at the ceil-
ing level, whereas the nonceiling group would per-
ceive it to occur at an intermediate level. For
the backward-blocking condition for both 
groups, cue A made no difference to the 
occurrence of the outcome (holding B con- 
stant, there was always a reaction whether or 
not A was there). However, as explained by 
the power PC theory, whereas a ΔP of 0 im-
plies noncausality (i.e., a causal rating of 0)
when the outcome occurred at a nonceiling 
level, the same value does not allow causal 
inference when the outcome occurred at a 
ceiling level. In support of this interpreta- 
tion, the mean causal rating for cue A was 
reliably lower for the nonceiling group than 
for the ceiling group. In contrast, recovery 
from overshadowing was not dependent on 
whether or not the outcome was perceived 
to occur at a ceiling level. 

Why does the level at which the outcome 
was perceived to occur lead to different re- 
sponses in the backward-blocking condition 
but not in the release-from-overshadowing
condition? This difference poses a problem
for associative accounts. Both designs in-
volved retrospective revaluation, but even 
modifications of associative models that ex- 
plain retrospective revaluation cannot ex- 
plain this difference. In contrast, a simple 
and intuitive answer follows from a causal 
account. 


Base Rate Influence on Conditions with
Identical ΔP. Several earlier studies on hu-
man contingency judgment have reported
that, although ΔP clearly influences
causal ratings (e.g., Allan & Jenkins, 1980;
Wasserman, Elek, Chatlosh, & Baker, 1993),
for a given level of ΔP, causal ratings diverge
from ΔP as the base rate of the effect e,
$P(e \mid \bar{c})$, increases. If we consider Eq. (7.6)
(the power PC theory) for any constant
positive ΔP, causal ratings should increase
as $P(e \mid \bar{c})$ increases. Conversely, according
to Eq. (7.7), preventive causal ratings
should decrease as $P(e \mid \bar{c})$ increases for
the same negative ΔP. Zero contingencies,
however, regardless of the base rate of e, 
should be judged as noncausal (except 
when judgment should be withheld due 
to ceiling effects). No other current model 
of causal learning predicts this qualitative 
pattern of the influence of the base rate 
of e, although some covariational or asso- 
ciative learning models can explain one or 
another part of this pattern given felicitous 
parameter values. For example, in the RW, 
if β_US > β_not-US, causal ratings will always
increase as base rate increases, whereas 
the opposite trend would be obtained if 
the parameter ordering were reversed. 
Another prominent associative learning 
model, Pearce’s (1987) model of stimulus 
generalization, can likewise account for 
opposite base rate influences in positive and 
negative contingencies if the parameters are 
set accordingly, but this model would then 
additionally predict a base rate influence on 
noncontingent conditions. 
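
The qualitative pattern described above can be checked numerically. The sketch below assumes that Eqs. (7.6) and (7.7), which are not reproduced in this passage, take the standard power PC forms from Cheng (1997): generative power q = ΔP / [1 − P(e | not-c)] and preventive power p = −ΔP / P(e | not-c). The function names and the particular base rates are illustrative choices, not values from the studies cited.

```python
def delta_p(p_e_c, p_e_notc):
    """Contingency: P(e | c) - P(e | not-c)."""
    return p_e_c - p_e_notc


def generative_power(p_e_c, p_e_notc):
    """Assumed form of Eq. (7.6); None at ceiling, where no inference is licensed."""
    if p_e_notc >= 1.0:
        return None
    return delta_p(p_e_c, p_e_notc) / (1.0 - p_e_notc)


def preventive_power(p_e_c, p_e_notc):
    """Assumed form of Eq. (7.7); None when e never occurs without c."""
    if p_e_notc <= 0.0:
        return None
    return -delta_p(p_e_c, p_e_notc) / p_e_notc


# Hypothetical base rates, with the contingency held at +0.25, -0.25, or 0.
generative = [generative_power(b + 0.25, b) for b in (0.0, 0.25, 0.5, 0.75)]
preventive = [preventive_power(b - 0.25, b) for b in (0.25, 0.5, 0.75, 1.0)]
zero = [generative_power(b, b) for b in (0.0, 0.25, 0.5, 0.75)]

print(generative)  # rises with the base rate
print(preventive)  # falls with the base rate
print(zero)        # stays at 0 below ceiling
```

Under these assumptions the three lists reproduce the predicted ordering: positive contingencies are rated higher, negative ones lower, and zero contingencies unchanged as the base rate grows.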

Figure 7.2 illustrates the intuitiveness of
a rating that deviates from ΔP. The reasoning
is counterfactual. P(e | not-c) estimates
the “expected” probability of e in the presence
of c if c had been absent so that only
causes other than c exerted an influence on e. 
A deviation from this counterfactual proba- 
bility indicates that c is a simple cause of 
e. Under the assumption that the patients 
represented in the figure were randomly as- 
signed to the two groups, one that received 
the drug and another that did not, one would 
reason that about one-third of the patients 
in the “drug” group would be expected to 
have headaches if they had not received 
the drug. The drug then would be the sole 
cause of headaches among the two-thirds 
who did not already have headaches caused 
by other factors. In this subgroup, headaches 
occurred in three-fourths of the patients. 
One might therefore reason, although ΔP =
1/2, that the probability the drug will produce
headaches is three-fourths.
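
The counterfactual reasoning in this example can be written out step by step. The sketch below uses only the fractions stated above (a base rate of one-third and ΔP = 1/2); the variable names are illustrative, and the final check compares the result with ΔP / [1 − P(e | not-c)], the form that Eq. (7.6) is assumed to take here.

```python
from fractions import Fraction

p_e_notc = Fraction(1, 3)   # headaches in the no-drug group
delta_p = Fraction(1, 2)    # contingency stated in the text
p_e_c = p_e_notc + delta_p  # headaches in the drug group

# Counterfactual step: the share of the drug group expected to have
# headaches anyway, and the remainder in which only the drug could act.
already_caused = p_e_notc
at_risk = 1 - already_caused

# Headaches attributable to the drug, first as a share of the whole
# drug group and then as a share of the at-risk subgroup.
newly_caused = p_e_c - already_caused
power = newly_caused / at_risk

print(p_e_c, at_risk, power)
assert power == delta_p / (1 - p_e_notc)
```

Exact fractions are used so that the three-fourths figure is reproduced without rounding.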

The initial attempts to test the power PC 
theory yielded mixed results. Buehner and 
Cheng (1997; see Buehner, Cheng, & Clifford,
2003, for a more detailed report) varied
the base rate of e for conditions with
the same value of ΔP using a sequential
trial procedure and demonstrated that base
rate indeed influences the evaluation of positive
and negative contingencies in the way
that power PC predicts. However, contrary 
to the predictions of the power PC the- 
ory, Buehner and Cheng (1997) also found 
that base rate influenced not only contingent
conditions with equal ΔP values but
also noncontingent conditions (in
which ΔP = 0). The latter, a robust result
(see Shanks, 1985a, 1987; and Shanks,
Holyoak, & Medin, 1996, for a review), seems
nonsensical if ΔP had in fact been 0 in the
input to the reasoner. Furthermore, they
also found that comparisons between certain
conditions where causal power [as defined
in Eqs. (7.6) and (7.7)] was constant
but ΔP varied showed variations in the direction
of ΔP, as predicted by the RW and
Pearce models.

Many researchers treated Buehner and 
Cheng’s (1997) and similar results (Lober & 
Shanks, 2000) as a given and regarded the 
findings that deviated from the predictions 
of the power PC theory as refutations of 
it. Lober and Shanks (2000) concluded 
that these results fully support RW, even 
though they had to use opposite parameter 
[...] candidates, as was the case for preventive candidates,
to fit the data. Similarly, Tenenbaum
and Griffiths (2001) concluded that these re- 
sults support their Bayesian causal support 
model, which evaluates how confident one 
is that c causes e. It does so by comparing 
the posterior probabilities of two causal net- 
works, both of which have a background 
cause that is constantly present in the learn- 
ing context, differing only in that one net- 
work has an arrow between c and e. When 
the posterior probability of the network with 
the extra arrow is greater than that without 
the arrow, then one decides that c causes e. 
Otherwise, one decides that c does not cause 
e. Support is defined as the log of the ratio 
of the two posterior probabilities. 
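
A minimal numerical sketch of such a support measure follows. Several ingredients are assumptions added for illustration and are not stated in this passage: the two causes combine by a noisy-OR rule, their strengths have uniform priors on (0, 1), and the two networks are equally probable a priori, so that the ratio of posterior probabilities equals the ratio of marginal likelihoods. The integrals are approximated on a grid, and the counts are hypothetical.

```python
import math


def likelihood(n_e_c, n_c, n_e_notc, n_notc, w0, w1):
    """Probability of the data given background strength w0 and candidate strength w1."""
    p_notc = w0
    p_c = 1.0 - (1.0 - w0) * (1.0 - w1)  # noisy-OR combination (assumed)
    return (p_c ** n_e_c * (1.0 - p_c) ** (n_c - n_e_c)
            * p_notc ** n_e_notc * (1.0 - p_notc) ** (n_notc - n_e_notc))


def causal_support(n_e_c, n_c, n_e_notc, n_notc, grid=100):
    """Log ratio of marginal likelihoods: network with the c -> e arrow vs. without it."""
    points = [(i + 0.5) / grid for i in range(grid)]  # midpoint rule on (0, 1)
    without_arrow = sum(
        likelihood(n_e_c, n_c, n_e_notc, n_notc, w0, 0.0) for w0 in points
    ) / grid
    with_arrow = sum(
        likelihood(n_e_c, n_c, n_e_notc, n_notc, w0, w1)
        for w0 in points for w1 in points
    ) / grid ** 2
    return math.log(with_arrow / without_arrow)


# Hypothetical counts: e on 8 of 8 trials with c and 0 of 8 without c,
# versus 4 of 8 in both cases (zero contingency).
strong = causal_support(8, 8, 0, 8)
null = causal_support(4, 8, 4, 8)
print(round(strong, 2), round(null, 2))
```

Because the measure reflects confidence that a link exists rather than the strength of the link, it depends on sample size as well as on the contingency.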


DEVIATIONS FROM NORMATIVITY 
AND AMBIGUOUS EXPERIMENTS 
Buehner et al.’s (2003) attempts to test the 
qualitative pattern of causal strengths pre- 
dicted by the power PC theory illustrate a 
modular approach to psychological research. 
This approach attempts to study the mind 
rather than behavior as it happens to be 
observed. It attempts to isolate the influ- 
ence of a mental process under study, even 
though tasks in our everyday life typically in- 
volve confounded contributions from multi- 
ple cognitive processes (e.g., comprehension 
and memory). An analysis of the experimen- 
tal materials in Buehner and Cheng (1997) 
suggests that the deviations from the power 
PC theory are due to factors extraneous to 
the causal inference process (Buehner et al., 
2003). First, the typical dependent variable 
used to measure causal judgments is highly 
ambiguous. Participants are typically asked 
to indicate how strongly they think c causes 
or prevents e. The question may be inter- 
preted to ask how confident one is that c 
causes e, rather than how strongly c causes e. 
Also, it may be interpreted to refer to either 
the current learning context or a counter- 
factual context in which there are no other 
causes. 

Notably, the distal approach allows formulations
of coherent answers to each of


2003; Tenenbaum & Griffiths, 2001). It 
seems plausible that people are capable 
of answering a variety of causal questions. 
Moreover, they may be able to do so coher- 
ently, in which case models of answers to the 
various questions would be complementary 
if they are logically consistent. 

Answers to the various questions (regard- 
ing the same conditions), however, may form 
different patterns. Testing the power PC 
theory directly requires removing both am- 
biguities. To do so, Buehner et al. (2003) 
adopted a counterfactual question: for ex- 
ample, “Imagine 100 patients who do not 
suffer from headaches. How many would 
have headaches if given the medication?” 
To minimize memory demands, Buehner 
et al. presented the trials simultaneously. 
They found that causal ratings using the 
counterfactual question and simultaneous 
trials were perfectly in line with causal 
power as predicted by the power PC the- 
ory. Berry (2003) corroborated Buehner et 
al.’s findings with a nonfrequentist counter- 
factual question. 

Buehner et al. (2003) explained how the 
ambiguity of earlier causal questions can lead 
to confounded results that show an influence 
of ΔP on conditions with identical causal
power. However, this ambiguity cannot account for the
base rate influence on noncontingent conditions.
Given the memory demands in
typical sequential trial experiments, though, it is inevitable
that some participants would
misperceive the contingencies to
be nonzero, in which case Eqs. (7.6) and
(7.7) would predict an influence of base rate. 
These equations explain why the mispercep- 
tions do not cancel each other out, as one 
might expect if they were random. Instead, 
for the same absolute amount of misper- 
ception, a positive misperception that oc- 
curs at a higher base rate would imply a 
higher generative power, and a negative mis- 
perception (leading to a negative causal rat- 
ing) that occurs at a lower base rate would 
imply a more negative preventive power. In 
both cases, causal ratings for objectively non- 
contingent candidates would increase as base 
rate increases. Thus, the base-rate influence 
[...] an interaction between memory and causal
reasoning. 
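
The asymmetry described in this paragraph can be illustrated with a small calculation. The sketch assumes the standard power PC forms for Eqs. (7.6) and (7.7), reports preventive power as a negative rating, and assumes that the base rate itself is perceived accurately while ΔP is misperceived by a fixed hypothetical amount.

```python
def signed_rating(perceived_delta_p, base_rate):
    """Signed causal rating implied by a perceived contingency at a given base rate.

    Positive contingencies use the assumed form of Eq. (7.6); negative ones use
    the assumed form of Eq. (7.7), reported here as a negative number.
    """
    if perceived_delta_p > 0:
        return perceived_delta_p / (1.0 - base_rate)
    if perceived_delta_p < 0:
        return perceived_delta_p / base_rate
    return 0.0


error = 0.1  # hypothetical size of the misperception
for base_rate in (0.25, 0.5, 0.75):
    over = signed_rating(+error, base_rate)
    under = signed_rating(-error, base_rate)
    print(base_rate, round(over, 3), round(under, 3), round((over + under) / 2, 3))
```

Both columns of ratings rise with the base rate, so equally frequent positive and negative misperceptions do not cancel when averaged across participants.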

Buehner et al. (2003) confirmed this in- 
terpretation in two ways. First, when learn- 
ing trials were presented simultaneously, 
thereby eliminating the possibility of mis- 
perceiving a zero contingency to be nonzero, 
participants no longer exhibited a base rate 
influence in noncontingent conditions. Sec- 
ond, they showed that in an experiment 
involving sequential trials, every judgment 
that deviated from 0 was indeed traceable
to the subject’s misperception of the zero
contingency. Every accurately perceived ΔP of
0 was rated as noncausal. Not a single subject
did what all nonnormative accounts predict:
differentially weighing an accurately
perceived ΔP of 0 to result in a nonzero
causal rating. 

In sum, earlier deviations from the power
PC theory’s predictions were the result 
of confounding due to comprehension and 
memory processes. Once these extraneous 
problems were curtailed, as motivated by a 
modular approach, causal ratings followed 
exactly the pattern predicted by power PC. 
The complex pattern of results observed 
cannot be accounted for by any current 
associationist model, regardless of how its 
parameters are set. In contrast, the power 
PC theory explains the results without any 
parameters. 


FLEXIBILITY AND COHERENCE 


A general goal of inference is that it 
is both flexible and coherent. We men- 
tioned earlier that a distal approach al- 
lows a coherent formulation of answers to 
different questions. These questions may 
concern confidence in the existence of a 
causal relation (Tenenbaum & Griffiths, 
2001); conjunctive causation (Novick & 
Cheng, 2004); prediction under a change 
in context, enabling conditions rather than 
causes (Cheng & Novick, 1991; Gold- 
varg & Johnson-Laird, 2001); and interven- 
tions (e.g., Cheng, 1997; Gopnik et al., 2004; 
Lagnado & Sloman, 2004; Steyvers et al., 
2003). The approach also provides an expla- 


(Macho & Burkart, 2002). 


Iterative Retrospective Revaluation. If an
equation in several variables characterizes 
the operation of a system, the equation can 
potentially be used flexibly to solve for each 
variable when given the values of other vari- 
ables, and the solutions would all be logi- 
cally consistent. Evidence suggests that the 
equations in the power PC theory are used 
this way. 

Macho and Burkart (2002, Experiment 2) 
presented trials in two phases: In the first, 
two pairs of candidate causes (TC and CD) 
were presented with the outcome e sometimes
occurring, with the same relative frequency
for both combinations; in the second
phase, a single disambiguating candidate,
D, was presented. Two experimental groups 
differed only with respect to whether e al- 
ways or never occurred with D in the second 
phase. For these groups, despite the fact that 
for both groups T and C were equally absent 
in the critical second phase, the mean causal 
ratings for T were higher than for C in one 
group, but lower in the other group. Con- 
sider what one would infer about T and C 
when D was always accompanied by e in the 
second phase (without D, e did not occur; 
therefore, D causes e). Holding D constantly 
present, because e occurred less often when 
C was there than when it was not, C prevents 
e, and its preventive power can be estimated. 
Instantiating Eq. (7.7) for this design, p_C is
estimable as just mentioned, and P(e | TC)
is given in phase 1; therefore, P(e | T not-C),
the only unknown in the equation, can
be solved. Once this unknown is solved, one
can next use it to apply Eq. (7.6) to T, which
has a positive ΔP: together with the information
that P(e | not-T not-C) = 0 given in
both phases, a positive generative power of
T results. T and C are therefore generative
and preventive, respectively. An analogous
sequence of inference can be made when D
is never accompanied by e, resulting in reversed
causal powers for T and C (preventive
and generative, respectively). Associative 
models either cannot predict any retrospec- 
tive revaluation or erroneously predict that 
[...]tion in phase 2 for each condition, because
these cues are equally absent in the phase 2 
when their weights are adjusted. 
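
The chain of inferences for the group in which D is always followed by e can be traced with exact arithmetic. The relative frequency of one-half for the TC and CD compounds is a hypothetical value chosen for illustration, and the equations are the assumed standard forms of Eqs. (7.6) and (7.7), instantiated in the contexts named in the comments.

```python
from fractions import Fraction

# Hypothetical relative frequencies; only their ordering follows the text.
p_e_TC = Fraction(1, 2)       # phase 1
p_e_CD = Fraction(1, 2)       # phase 1 (same relative frequency as TC)
p_e_D = Fraction(1, 1)        # phase 2: D is always followed by e
p_e_notT_notC = Fraction(0)   # e does not occur when no candidate is present

# Step 1: preventive power of C, holding D constantly present (Eq. 7.7).
p_C = (p_e_D - p_e_CD) / p_e_D

# Step 2: treat P(e | T not-C) as the unknown in Eq. (7.7) with T present:
#   p_C = [P(e | T not-C) - P(e | TC)] / P(e | T not-C)
p_e_T_notC = p_e_TC / (1 - p_C)

# Step 3: generative power of T from Eq. (7.6).
q_T = (p_e_T_notC - p_e_notT_notC) / (1 - p_e_notT_notC)

print(p_C, p_e_T_notC, q_T)
```

Step 2 is the flexible use of the equation: a quantity on the right-hand side is solved for once the causal power on the left-hand side is known.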

These results on iterative revaluations 
show that the use of Eqs. (7.6) and (7.7) 
in Cheng’s (1997) power PC theory is more 
flexible than she originally discussed. In her 
paper, she interpreted the causal power vari- 
ables on the left-hand side (LHS) as the de- 
sired unknowns. What Macho and Burkart 
(2002) showed is that, when given the value 
of the variables on the LHS, people are able 
to treat a variable on the right-hand side as 
the desired unknown and solve for it. 


Intervention. The advantage of interven- 
tion over observation is most readily ap- 
preciated when trying to establish which 
of several competing hypotheses underlies 
a complex data structure. We explained this 
abstractly earlier in terms of the likely satisfaction
of the independent occurrence condition.
Let us now consider as an example
the often reported correlation between
teenage aggression, consumption of violent
television or movies, and poor school performance.
A correlation among these three
variables could be due to either a common-cause
structure, AGGRESSION ← TV →
SCHOOL, where violent television would
be the cause of both poor school performance
and increased aggression, or a
chain structure, AGGRESSION → TV
→ SCHOOL, where increased aggression
would lead to increased consumption of violent
TV, which in turn results in poor
school performance. Without temporal information,
these competing causal models
cannot be distinguished by observation lim- 
ited to the three-node network alone. How- 
ever, if one were to intervene on the TV 
node, the two structures make different 
predictions: According to the former, restric- 
tions on access to violent TV should lead 
to both improved school performance and 
decreased aggression; according to the lat- 
ter, the same restriction would still improve 
school performance but would have no ef- 
fect on aggressive behavior. Note that the 
intervention on TV effectively turned what 


network: The amount of TV is controlled 
by an external agent, which was not rep- 
resented in the simple three-node network. 
When the amount of TV is manipulated un- 
der free will, the external node would oc- 
cur independently of aggression in the causal 
chain structure, because aggression and the 
external agent are alternative causes (of con- 
sumption of violent TV) in the causal chain 
structure, but not in the common cause 
structure. As mentioned earlier, one is likely 
to assume that alternative causes of an out- 
come remain constant while that outcome 
is manipulated under free will. This assumption,
together with the independent occurrence
condition, explains why manipulation
allows differentiation between the two
structures. 
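
The contrast between observing and intervening can be made concrete with a toy version of the two structures. All probabilities below are hypothetical, and the functions simply encode the assumption that setting TV from outside removes any influence of aggression on TV while leaving the other links intact.

```python
# Hypothetical conditional probabilities; none of these numbers come from the text.
P_AGG_GIVEN_TV = {True: 0.7, False: 0.2}      # common cause: TV -> AGGRESSION
P_SCHOOL_GIVEN_TV = {True: 0.6, False: 0.2}   # both structures: TV -> poor SCHOOL
P_AGG = 0.4                                   # chain: base rate of AGGRESSION
P_TV_GIVEN_AGG = {True: 0.8, False: 0.3}      # chain: AGGRESSION -> TV


def common_cause_do(tv):
    """P(aggression), P(poor school) when TV is set by an external agent."""
    return P_AGG_GIVEN_TV[tv], P_SCHOOL_GIVEN_TV[tv]


def chain_do(tv):
    """Setting TV externally severs AGGRESSION -> TV; aggression keeps its base rate."""
    return P_AGG, P_SCHOOL_GIVEN_TV[tv]


def chain_observe(tv):
    """P(aggression | TV merely observed) in the chain, by Bayes' rule."""
    joint_agg = (P_TV_GIVEN_AGG[True] if tv else 1 - P_TV_GIVEN_AGG[True]) * P_AGG
    joint_not = (P_TV_GIVEN_AGG[False] if tv else 1 - P_TV_GIVEN_AGG[False]) * (1 - P_AGG)
    return joint_agg / (joint_agg + joint_not)


for name, do in (("common cause", common_cause_do), ("chain", chain_do)):
    agg_effect = do(True)[0] - do(False)[0]
    school_effect = do(True)[1] - do(False)[1]
    print(name, round(agg_effect, 2), round(school_effect, 2))
print("chain, observed:", round(chain_observe(True) - chain_observe(False), 2))
```

In the chain, aggression and TV are correlated when merely observed, yet the intervention on TV leaves aggression unchanged; in the common-cause structure the same intervention changes both aggression and school performance.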


An Enabling Condition. When asked 
“What caused the forest fire?” investigators 
are unlikely to reply, “The oxygen in the air.” 
Rather, they are likely to reserve the title of 
cause for such factors as “lightning,” “arson,”
or the “dryness of the air.” To explain
the distinction between causes and enabling 
conditions, a number of theorists argued that 
a causal question invariably implies com- 
putation within a selected set of events 
in which a component cause is constantly 
present (e.g., Mackie, 1974). On this view, 
the forest fire question can be understood as 
“What made the difference between this oc- 
casion in the forest on which there was a fire 
and other occasions in the forest on which 
there was no fire?” Note that the selected 
set of events in the expanded question does 
not include all events in one’s knowledge 
base that are related to fire. In particular, 
it does not include events in which oxygen 
is absent, even though such events (at 
least in an abstract form) are in a typical 
educated person’s knowledge base. The 
power PC theory explains the distinction 
between causes, enabling conditions, and 
irrelevant factors the same way as Cheng 
and Novick (1992) do, except that now
there is a justification for conditions that 
allow causal inference. A varying candidate 
cause is a cause if it covaries with the target 
[...] specified by the expanded question, in
which other causes and causal factors are
constant. A candidate cause is an enabling 
condition if it is constantly present in the 
current set of events but is a cause ac- 
cording to another subset of events. Finally, 
a candidate cause is irrelevant if its co- 
variation with the effect is not noticeably 
different from 0 in any subset of events
that allows causal inference. (See Gold- 
varg & Johnson-Laird, 2001, for a similar 
explanation.) 
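
The three-way distinction can be sketched as a classification over sets of events. This is a deliberate simplification: it computes an unconditional ΔP within each set rather than holding alternative causes constant, the threshold for "noticeably different from 0" is arbitrary, and the event records (including the irrelevant factor "birdsong") are invented for illustration.

```python
def delta_p(events, factor, effect="fire"):
    """Contingency of effect on factor within one set; None if factor is constant."""
    with_factor = [ev[effect] for ev in events if ev[factor]]
    without_factor = [ev[effect] for ev in events if not ev[factor]]
    if not with_factor or not without_factor:
        return None
    return sum(with_factor) / len(with_factor) - sum(without_factor) / len(without_factor)


def covaries(events, factor, threshold=0.1):
    """True if the factor's contingency in this set is noticeably different from 0."""
    dp = delta_p(events, factor)
    return dp is not None and abs(dp) > threshold


def classify(factor, focal_set, other_sets):
    """Label a factor relative to a focal set and any other available sets."""
    if covaries(focal_set, factor):
        return "cause"
    constantly_present = all(ev[factor] for ev in focal_set)
    if constantly_present and any(covaries(s, factor) for s in other_sets):
        return "enabling condition"
    return "irrelevant"


def make(oxygen, lightning, birdsong, fire):
    return {"oxygen": oxygen, "lightning": lightning, "birdsong": birdsong, "fire": fire}


# Focal set: occasions in the forest, where oxygen is always present.
forest = [make(1, 1, 1, 1), make(1, 1, 0, 1), make(1, 0, 1, 0), make(1, 0, 0, 0)]
# Another set of events in which oxygen varies.
other = [make(1, 1, 1, 1), make(0, 1, 1, 0), make(1, 1, 0, 1), make(0, 1, 0, 0)]

for factor in ("lightning", "oxygen", "birdsong"):
    print(factor, "->", classify(factor, forest, [other]))
```

The same factor (oxygen) would be classified as a cause if the set in which it varies were taken as the focal set, which is the sense in which the label depends on the question asked.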


Causal Inference and Category 
Formation: What Is the Level of 
Abstraction at Which ΔP Should
Be Computed? 


Cheng (1993) noted the problem of the 
level of abstraction at which covariations 
should be calculated. Consider the problem 
of evaluating whether smoking causes lung 
cancer. The candidate cause “smoking” can 
be viewed at various levels of abstraction, 
for instance, “smoking a particular brand of 
cigarettes” or “inhaling fumes”. If one were 
to compute ΔP for smoking with respect to
lung cancer, one would obtain lower values
for both the narrower and the more abstract
conceptions of the cause than for “smoking
cigarettes.” For example, if one adopted the
more abstract conception “inhaling fumes,”
P(e | not-c) would remain unchanged, but one
would lower P(e | c) because now other
noncarcinogenic fumes (e.g., steam) contribute
to the estimate of this probability. The
more abstract conception would result in a
smaller overall probability of c to produce e.

Causes and effects (like all events, see 
Vallacher & Wegner, 1987) can be con- 
ceptualized at various levels of abstraction. 
Cheng (1993) hypothesized that to evalu- 
ate a causal relation, people represent the 
relation at the level of abstraction at which 
ΔP, with alternative causes held constant, is
maximal. Lien and Cheng (2000) showed 
that people indeed are sensitive to this idea. 
In a completely novel situation, where par- 


ground knowledge (unlike in the smoking/ 
lung cancer example), stimuli varied along 
two dimensions, color and shape, such that 
variations could be described at various lev- 
els of abstraction (e.g., cool vs. warm col- 
ors, red vs. orange, or particular shades of 
red). Participants in Lien and Cheng’s exper- 
iments spontaneously represented the causal 
relation they learned at the level of abstraction
at which ΔP was maximal.
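
Choosing the level of abstraction that maximizes ΔP can be illustrated with the smoking example. The counts below are invented, the three candidate definitions of the cause are illustrative, and alternative causes are simply ignored rather than held constant.

```python
# Hypothetical counts: (number of people, number with lung cancer) per exposure.
counts = {
    "brand A cigarettes": (100, 30),
    "brand B cigarettes": (100, 30),
    "steam": (100, 5),
    "nothing inhaled": (100, 5),
}

# Candidate definitions of the cause c, from narrow to abstract.
levels = {
    "smoking brand A": {"brand A cigarettes"},
    "smoking cigarettes": {"brand A cigarettes", "brand B cigarettes"},
    "inhaling fumes": {"brand A cigarettes", "brand B cigarettes", "steam"},
}


def delta_p(members):
    """P(e | c) - P(e | not-c) when c is defined as membership in `members`."""
    n_c = sum(n for k, (n, _) in counts.items() if k in members)
    e_c = sum(e for k, (_, e) in counts.items() if k in members)
    n_not = sum(n for k, (n, _) in counts.items() if k not in members)
    e_not = sum(e for k, (_, e) in counts.items() if k not in members)
    return e_c / n_c - e_not / n_not


scores = {name: delta_p(members) for name, members in levels.items()}
best = max(scores, key=scores.get)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
print("maximal:", best)
```

With these counts the narrow definition raises P(e | not-c) and the abstract definition lowers P(e | c), so the intermediate level yields the largest contingency.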

Computing ΔP at an optimal level is consistent
with an approach to causal learning
that does not begin with well-defined can- 
didate causes. In contrast, the current de- 
fault assumption in the psychological liter- 
ature is that causal discovery depends on 
the definition of the entities among which 
relations are to be discovered; categoriza- 
tion therefore precedes causal discovery. The 
opposite argument can be made, however. 
Causal discovery could be the driving force 
underlying our mental representation of the 
world — not only in the sense that we need 
to know how things influence each other 
but also in the sense that causal relations 
define what should be considered things in 
our mental universe (Lewis, 1929). Lien and 
Cheng (2000) provided evidence that the 
definition of an entity and the discovery of 
a causal relation operate as a single pro- 
cess in which optimal causal discovery is 
the driving force. Causal discovery therefore 
has direct implications for the formation of 
categories instead of requiring well-defined 
candidate causes as givens. 


Time and Causal Inference: The Time- 
Frame of Covariation Assessment 


We have concentrated on theoretical ap- 
proaches that specify how humans take the 
mental leap from covariation to causation. 
Irrespective of any differences in theoreti- 
cal perspective, all these approaches assume 
covariation can be readily assessed. This as- 
sumption is reflected in the experimental 
paradigms most commonly used. Typically, 
participants are presented with evidence 
structured in the form of discrete, simulta- 
neous, or sequential learning trials in which 
[...] the cause occurred and whether the effect
occurred. In other words, in these tasks it is 
always perfectly clear whether a cause is fol- 
lowed by an effect on a given occasion. Such 
tasks grossly oversimplify the complexities 
of causal induction in some situations out- 
side experimental laboratories: Some events 
have immediate outcomes; others do not re- 
veal their consequences until much later. Be- 
fore an organism can evaluate whether a spe- 
cific covariation licenses causal conjecture, 
the covariation needs to be detected and 
parsed in the first place. 

So far, little research effort has been 
directed toward this problem. The scarce 
evidence that exists comes from two very 
different theoretical approaches. One is associative
learning, and the other is perception
of causality. Using an instrumental
learning paradigm, Shanks, Pearson, and
Dickinson (1989) instructed participants to 
monitor whether pressing a key caused a tri- 
angle to light up on a computer screen. The 
apparatus was programmed to illuminate the 
triangle 75% of the time the key was pressed
and never when the key was not pressed. 
However, participants were also told that 
sometimes the triangle might “light up on its 
own.” This actually never happened in any 
of the experimental conditions but only in a 
set of yoked control conditions during which 
the apparatus played back an outcome pat- 
tern produced in the previous experimental 
condition. In other words, in these control 
conditions, participants’ key presses were 
without any consequences whatsoever. Par- 
ticipants could distinguish reliably between 
experimental and control conditions (i.e., 
they noticed whether their key presses were 
causally effective). However, when Shanks 
et al. inserted a delay between pressing the 
key and the triangle’s illumination, the dis- 
tinction became considerably harder. In fact, 
when the delay was longer than 2 seconds, 
participants could no longer distinguish 
between causal and noncausal conditions, 
even though their key presses were still ef- 
fective 75% of the time. Shanks et al. inter- 
preted this finding as supporting an associa- 
tive account of causal judgment. 


chapter) refers to the instant impression of 
causality that arises from certain stimulus 
displays. The most prominent phenomenon 
is the launching effect. An object A moves 
toward a stationary object B until it col- 
lides with B. Immediately after the colli- 
sion, B moves along the same trajectory as A, 
while A becomes stationary. Nearly all perceivers
report that such displays look as if A
“launched” B or “made B move” (Michotte, 
1946/1963; for a recent overview, see Scholl 
& Tremoulet, 2000). However, if a temporal 
gap of more than 150 ms is inserted between 
the collision of A and B and the onset of 
B’s motion, the impression of causality dis- 
appears and observers report two distinct, 
unrelated motions. 

From a computational perspective, it is 
easy to see why delays would produce decre- 
ments in causal reasoning performance. 
Contiguous event pairings are less demand- 
ing on attention and memory. They are also 
much easier to parse. When there is a tem- 
poral delay and there are no constraints on 
how the potential causes and effects are 
bundled, as in Shanks et al. (1989), the basic 
question on which contingency depends no 
longer has a clear answer: Should this particular
instance of e be classified as occurring
in the presence of c or in its absence?
Each possible value of temporal lag results in 
a different value of contingency. The prob- 
lem is analogous to that of the possible lev- 
els of abstractions of the candidate causes 
and the effects at which to evaluate contin- 
gency (and may have an analogous solution). 
Moreover, for a given e, when alternative intervening
events occur, the number of hypotheses
to be considered multiplies. The result
is a harder, more complex inferential
problem — one with a larger search space. 
One might think that keeping track of out- 
come rates and changes in these rates condi- 
tional on the presence and absence of other 
events would solve the problem (Gallistel & 
Gibbon, 2000). Measuring outcome rates, 
however, would not help in Shanks et al.’s 
(1989) situation. Unless there are additional 
constraints (e.g., discrete entities in which c 
may or may not occur at any moment, but 
[...] considered “present” for that entity, even when it
is no longer occurring), the parsing problem 
remains, as does the proliferation of candi- 
date causes that precede an outcome. 

Until now, we have focused on situations 
in which there is no prior causal knowl- 
edge. We digress here to discuss a case in 
which there is such knowledge. When the 
search space is large, constraints provided 
by prior knowledge of types of causal re- 
lations become increasingly important. As- 
sessing maximal covariation among the set 
of hypotheses may be impractical given 
the large search space, or at least inefficient
given the existence of prior knowledge.
When there is prior knowledge, why
not use it? Some evidence suggests, however, 
that children are unable to integrate prior 
temporal knowledge with frequency observations.
Schlottmann (1999) showed that
5- to 7-year-old children, although able to 
learn about and understand delayed causal 
mechanisms perfectly, when presented with 
a choice between a delayed and immedi- 
ate cause, always preferred the immediate, 
contiguous cue, even when they explicitly 
knew that the causal relation in question in- 
volved a delay. Schlottmann interpreted her 
findings to indicate that temporal contigu- 
ity is a powerful cue to causality. Because 
young children fail to integrate two kinds of 
evidence (knowledge of a delayed mecha- 
nism and contingency evaluated at the hy- 
pothesized delay), they discard the knowl- 
edge cue and focus exclusively on temporal 
contiguity. 

Adult reasoners, in contrast, can most 
likely integrate the two kinds of evidence. 
If the reasoner anticipates that a causal re- 
lation might involve a delay, its discovery 
and assessment should be considerably eas- 
ier. According to Einhorn and Hogarth’s 
(1986) knowledge mediation hypothesis, peo- 
ple make use of their prior causal knowledge 
about the expected length of the delay to 
reduce the complexity of the inference prob- 
lem. They focus on the expected delay for 
a type of causal relation and evaluate ob- 
servations with respect to it. In Bayesian 
terms, they evaluate likelihoods, the prob- 


hypothesis. Both Shanks et al.’s (1989) and 
Michotte’s (1946/1963) findings are consis- 
tent with Einhorn and Hogarth’s (1986) hy- 
pothesis. However, these findings cannot be 
cited as unequivocally demonstrating that 
adults use prior causal knowledge as a ba- 
sis for event parsing because the inductive 
problem gets increasingly difficult as the de- 
lay increases, and an account based on prob- 
lem difficulty alone would predict the same 
qualitative pattern of results. 

Hagmayer and Waldmann (2002) showed 
that people use prior knowledge of tempo- 
ral intervals in causal relations to classify ev- 
idence about the presence and absence of c 
and e in continuous time accordingly. Par- 
ticipants in their Experiment 1 were pre- 
sented with longitudinal information con- 
cerning the occurrence of mosquito plagues 
over a 20-year period in two adjacent com- 
munities. They were told that one com- 
munity relied on insecticides, whereas the 
other employed biological means (planting 
a flower that mosquito larvae-eating beetles 
need to breed). Although the instructions 
never mentioned the time frame of the 
causal mechanisms in question explic- 
itly, Hagmayer and Waldmann assumed 
the insecticide instructions would create 
expectations of immediate causal agency, 
whereas mentioning the biological mecha- 
nism would create expectation of a delay. 
Data were presented in tabular form show- 
ing for each of the 20 years whether the in- 
tervention had taken place (insecticide de- 
livered, plants planted) and whether there 
was a plague in that year. The data were con- 
structed to yield a moderately negative con- 
tingency between intervention and plague 
when considered within the same year but a 
positive contingency when considered over 
a 1-year delay. Participants’ evaluation of the
same covariational data varied as a function 
of the instructions in line with a knowledge- 
mediation account. These results illustrate 
that people in principle can and do use tem- 
poral knowledge to structure evidence into 
meaningful units. 
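
How the assumed time frame changes the contingency can be shown with a small computation. The two 20-year series below are invented for illustration (they are not the original stimuli) and were constructed so that the sign of ΔP reverses between a same-year and a 1-year-lag pairing, as described above.

```python
def delta_p_at_lag(cause, effect, lag):
    """P(effect at t + lag | cause at t) - P(effect at t + lag | no cause at t)."""
    pairs = [(cause[t], effect[t + lag]) for t in range(len(cause) - lag)]
    with_cause = [e for c, e in pairs if c]
    without_cause = [e for c, e in pairs if not c]
    return sum(with_cause) / len(with_cause) - sum(without_cause) / len(without_cause)


# Hypothetical yearly records (1 = yes, 0 = no).
intervention = [1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0]
plague       = [0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1]

same_year = delta_p_at_lag(intervention, plague, 0)
next_year = delta_p_at_lag(intervention, plague, 1)
print(round(same_year, 2), round(next_year, 2))
```

The same records thus support opposite conclusions depending on which pairing of years the reasoner treats as a single trial.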

Buehner and May (2002, 2003, 2004) fur- 
ther showed that adults are able to reduce 
[...] on causal relations elapsing in real time also
by making use of expectations about the 
time frame of the causal relation in question. 
Buehner and May instructed participants at 
the outset of the experiment about potential 
delays. They did this in a number of ways and 
found that both explicit and implicit instruc- 
tions about potential delays improved the 
assessment of delayed causal relationships. 
The use of prior temporal knowledge raises 
the question of how that knowledge might 
have been acquired. Causal discovery with- 
out prior temporal knowledge may be diffi- 
cult (e.g., longitudinal studies are expensive, 
even though they have more constraints for 
limiting the search space than in Shanks et 
al.’s situation), but it is possible given com- 
putational resources. 


Summary and Future Directions 


Our chapter has taken a computational per- 
spective — in particular, one of construct- 
ing an artificial intelligence system capable 
of causal learning given the types of non- 
causal observations available to the system. 
We have reviewed arguments and empirical 
results showing that an approach that inter- 
prets observable events in terms of a hypo- 
thetical causal framework explains why co- 
variation need not imply causation and how 
one can go beyond predicting future obser- 
vations to predicting the consequences of 
interventions. An additional appeal of this 
approach is that it allows one to address 
multiple research questions within a coher- 
ent framework. We compared this frame- 
work with an associative framework in our 
review of previous theoretical and empirical 
research, which focused on the estimation 
of causal strength. There are many other in- 
teresting causal questions that remain to be 
addressed under this framework. Some of 
these are 


* How do people evaluate their confi- 
dence in whether a causal relation exists? 
Tenenbaum and Griffiths (2001) pro- 


evaluate it. The assessment of confidence, 
unlike causal strength, can give rise to the 
observed gradual acquisition curves. 


* How do people evaluate how much an
outcome that is known to have occurred 
is attributable to a candidate cause (see 
Ellsworth, Chap. 28; Spellman, 2000)? 
This issue is important in legal decision 
making. Can a causal power approach 
overcome the difficulty in cases involving 
overdetermination? 

* What determines the formation of the
categories? Does it matter whether the 
variables are linked by a causal relation? 
What demarcates an event given that 
events occur in continuous time? Does a 
new category form in parallel as a new 
causal relation is inferred? 

* What determines the level of abstraction
at which a causal relation is inferred? 
What determines the choice of the tem- 
poral interval between a cause and an ef- 
fect for probabilistic causal relations? 


* Do people make use of prior causal
knowledge in a Bayesian way (Tenen- 
baum & Griffiths, 2002)? Are various 
kinds of prior causal knowledge (e.g.,
temporal, mechanistic) integrated with 
current information in the same way? 
What role, if any, does coherence play? 
All models of causal learning in principle 
allow the use of prior causal knowledge, 
regardless of whether they are Bayesian. 
If a comparison among these models in- 
volves a situation in which the reasoner 
has prior knowledge, then the default as- 
sumption would be to equate the input 
to the models, for example, by supply- 
ing the data on which prior causal knowl- 
edge is based in the input supplied to 
the non-Bayesian models. They would 
not be alternative models with respect 
to the last two questions, for example. 
It seems to us that including the use of 
prior knowledge would not make a differ- 
ence at Marr’s computational level with 
respect to the issue of what is computed 
in the process but would concern issues of 
models, there is explicit representa-
tion of the prior probability of a causal 
hypothesis. 

* Are people able to make use of patterns
of conditional independence as Bayesian 
network models do (Gopnik et al., 2004) 
to infer entire causal networks, rather 
than infer individual causal relations link 
by link as assumed by most current asso- 
ciative and causal accounts? 
Note


1. Ulrike Hahn provided this interpretation. 
CHAPTER 8


Deductive Reasoning 


Jonathan St. B. T. Evans 


The study of deductive reasoning has been a 
major field of cognitive psychology for the 
past 40 years or so (Evans, 2002; Evans, 
Newstead, & Byrne, 1993; Manktelow, 
1999). The field has its origins in philosophy, 
within the ancient discipline of logic, and 
reflects the once influential view known as 
logicism in which logic is proposed to be the 
basis for rational human thinking. This view 
was prevalent in the 1960s when psycho- 
logical study of deductive reasoning became 
an established field in psychology, espe- 
cially reflecting the theories of the great de- 
velopmental psychologist Jean Piaget (e.g.,
Inhelder & Piaget, 1958). Logicism was 
also influentially promoted to psychologists 
studying reasoning in a famous paper by 
Henle (1962). At this time, rationality was 
clearly tied to logicality. 

So what exactly is deductive logic? (See 
Sloman & Lagnado, Chap. 5, for a contrast 
with induction.) As a model for human rea- 
soning, it has one great strength but several 
serious weaknesses. The strength is that an 
argument deemed valid in logic guarantees 
that if the premises are true, then the conclusion
will also be true. Consider a syllogism
(an old form of logic devised by Aristotle) 
with the following form: 


All C are B. 
No A are B. 
Therefore, no A are C. 


This is a valid argument and will remain so no
matter what terms we substitute for A, B, 
and C. For example, 


All frogs are amphibians.
No cats are amphibians.
Therefore, no cats are frogs.


has two true premises and a true conclusion. 
Unfortunately, the argument is equally valid 
if we substitute terms as follows: 


All frogs are mammals.
No cats are mammals.
Therefore, no cats are frogs.

A valid argument can allow a true conclusion to be
drawn from false premises, as previously, which would
make it seem a nonsense to most ordinary people (that is, not
describing everyday reasoning, but there are
others. The main limitation is that deduc- 
tive reasoning does not allow you to learn 
anything new at all because all logical argu- 
ment depends on assumptions or supposi- 
tions. At best, deduction may enable you to 
draw out conclusions that were only implicit 
in your beliefs, but it cannot add to those 
beliefs. There are also severe limitations 
in applying logic to real world arguments 
where premises are uncertain and conclu- 
sions may be made provisionally and later 
withdrawn (Evans & Over, 1996; Oaksford & 
Chater, 1998). 

Although these limitations are nowa- 
days widely recognized, the ability of peo- 
ple to reason logically (or the lack of it) 
was considered an important enough is- 
sue in the past for the use of the deduc- 
tion paradigm to become well established. 
The standard paradigm consists of giving 
people premises and asking them to draw 
conclusions. There are two key instructions 
that make this a deductive reasoning task. 
First, people must be told to assume the 
premises are true and (usually) are told to 
base their reasoning only on these premises. 
Second, they must only draw or endorse 
a conclusion that necessarily follows from 
the premises. 

An example of a large deductive reason- 
ing study was that more recently reported by 
Evans, Handley, Harper, and Johnson-Laird 
(1999) using syllogistic reasoning. Syllogisms 
have four kinds of statement as follows: 


Universal             All A are B.
Particular            Some A are B.
Negative universal    No A are B.
Negative particular   Some A are not B.


Because a syllogism comprises two premises 
and a conclusion, there are 64 possible moods 
in which each of the three statements can 
take each of the four forms. In addition, 
there are four figures produced by chang- 
ing the order of reference to the three linked 
terms, A, B, and C, making 256 logically 
distinct syllogisms. For example, the fol- 
lowing syllogisms have the same mood but 
different figures: 


(1) No C are B.                   (2) No C are B.
    Some A are B.                     Some B are A.
    Therefore, some A are not C.      Therefore, some C are not A.


Although these arguments look very simi- 
lar, (1) is logically valid and (2) is invalid. 
Like most invalid arguments, the conclusion 
to (2) is possible given the premises, but not 
necessary. Hence, it is a fallacy. Here is a case 
in which a syllogism in form (2) seems per- 
suasive because it has true premises and a 
true conclusion: 


No voters are under 18 years of age. 
Some film stars are under 18 years of age. 
Therefore, some voters are not film stars. 


However, we can easily construct a coun- 
terexample case. A counterexample proves 
an argument to be invalid by showing that 
you could have true premises but a false con- 
clusion, such as 


No bees are carnivores. 
Some animals are carnivores. 
Therefore, some bees are not animals. 
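
The search for a counterexample can also be carried out mechanically. The following Python sketch is an illustration only and is not taken from Evans et al. (1999). It assumes that a "world" is fully described by which of the eight combinations of membership in A, B, and C are inhabited, and it reads "All X are Y" without existential import. The helper names (`holds`, `is_valid`) are hypothetical. It also reproduces the count of 64 moods and 256 syllogisms given earlier.

```python
from itertools import product

FORMS = ("all", "some", "no", "some_not")   # the four kinds of statement
TERM = {"A": 0, "B": 1, "C": 2}

# Four forms for each of three statements give the moods; four figures per mood.
N_MOODS = len(list(product(FORMS, repeat=3)))   # 64
N_SYLLOGISMS = N_MOODS * 4                      # 256

# An individual is a triple (in_A, in_B, in_C). A world is the list of
# kinds of individual that exist in it (assumption: nothing else matters).
KINDS = list(product([False, True], repeat=3))


def worlds():
    """Yield every possible world, including the empty one."""
    for mask in product([False, True], repeat=len(KINDS)):
        yield [k for k, present in zip(KINDS, mask) if present]


def holds(statement, world):
    """Evaluate one syllogistic statement, e.g. ("no", "C", "B"), in a world."""
    form, x, y = statement
    xs = [k for k in world if k[TERM[x]]]
    if form == "all":
        return all(k[TERM[y]] for k in xs)
    if form == "some":
        return any(k[TERM[y]] for k in xs)
    if form == "no":
        return not any(k[TERM[y]] for k in xs)
    if form == "some_not":
        return any(not k[TERM[y]] for k in xs)
    raise ValueError(form)


def is_valid(premises, conclusion):
    """Valid: no world makes all premises true and the conclusion false."""
    return all(holds(conclusion, w)
               for w in worlds()
               if all(holds(p, w) for p in premises))


# Form (1): No C are B; some A are B; therefore, some A are not C.
form_1 = ([("no", "C", "B"), ("some", "A", "B")], ("some_not", "A", "C"))
# Form (2): No C are B; some B are A; therefore, some C are not A.
form_2 = ([("no", "C", "B"), ("some", "B", "A")], ("some_not", "C", "A"))

print(N_MOODS, N_SYLLOGISMS)
print("form (1) valid:", is_valid(*form_1))
print("form (2) valid:", is_valid(*form_2))
```

Under these assumptions the check agrees with the text: form (1) survives every world, whereas form (2) fails in at least one, which is exactly what a counterexample such as the bees case demonstrates.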


Evans et al. (1999) actually gave partici- 
pants all 64 possible combinations of syllo- 
gistic premises and asked one group to decide
whether each of the four possible
conclusions followed necessarily from these 
premises in line with standard deductive rea- 
soning instructions (in this study, all problem 
materials were abstract, using capital letters 
for the terms). A relatively small number of 
syllogisms have necessary (valid) conclusions 
or impossible (determinately false) conclu- 
sions. Most participants accepted the former 
and rejected the latter in accord with logic. 
The interesting cases are the potential falla- 
cies like (2), where the conclusion could be 
true but does not have to be. In accordance 
with previous research, Evans et al. found 
that fallacies were frequently endorsed, al- 
though with an interesting qualification to 
which we return. They ran a second group 
who were instructed to endorse conclusions 
that could be true (that is, possible) given
their premises. The results suggested that 
ordinary people have a poor understanding 
should have selectively increased acceptance
of conclusions normally marked as fallacies. 
In fact, participants in the possibility groups 
accepted conclusions of all kinds more fre- 
quently, regardless of the logical argument. 


Rule- Versus Model-Based Accounts 
of Reasoning 


Logical systems can be described using a syn- 
tactic or semantic approach, and psycholog- 
ical theories of deductive reasoning can be 
similarly divided. In the syntactic approach, 
reasoning is described using a set of abstract 
inference rules that can be applied in se- 
quence. The approach is algebraic in that 
one must start by recovering the logical form 
of an argument and discarding the particu- 
lar content or context in which it is framed. 
In standard propositional logic, for example, 
several inference rules are applied to con- 
ditional statements of the form if p then q. 
These rules can be derived from first prin- 
ciples of the logic and provide a short-cut 
method of deductive reasoning. Here are 
some examples: 


Modus Ponens (MP)     Modus Tollens (MT)
If p then q           If p then q
p                     not-q
Therefore, q          Therefore, not-p


For example, suppose we know that “if 
the switch is down then the light is on.” If 
I notice that the switch is down, then I can 
obviously deduce that the light is on (MP). 
If I see that the light is off I can also validly 
infer that the switch is not down (MT). One 
of the difficulties with testing people’s logi- 
cal ability with such arguments, however, is 
that they can easily imagine counterexample 
cases that block such valid inferences (Evans 
et al., 1993). For example, if the light bulb 
has burned out, neither MP nor MT will de-
liver a true conclusion. That is why the in- 
struction to assume the truth of the premises 
should be part of the deduction experiment. 
It also shows why deductive logic may have 
limited application in real world reasoning, 


ments — do have exceptions. 

Some more complex rules involve suppo- 
sitions. In suppositional reasoning, you add 
a temporary assumption to those given that 
is later deleted. An example is conditional 
proof (CP), which states that if by assum- 
ing p you can derive q, then it follows that 
if p then q, a conclusion that no longer de- 
pends on the assumption of p. Suppose the 
following information is given: 


If the car is green, then it has four-wheel 
drive. 

The car has either four-wheel drive or 
power steering, but not both. 


What can you conclude? If you make the 
supposition that the car is in fact green, then 
you can draw the conclusion, in two steps, 
that it does not have power steering. Now 
you do not know if the car is actually green, 
but the CP rule allows you to draw the con- 
clusion, “If the car is green then it does not 
have power steering.” 
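
The car example can be checked by brute force. The sketch below is a semantic check by enumeration rather than a rule-based derivation of the kind the CP rule describes, and it assumes the material reading of "if ... then". The function and variable names are hypothetical.

```python
from itertools import product


def premises_hold(green, four_wheel, power_steering):
    """Both given statements, under a material reading of the conditional."""
    rule_1 = (not green) or four_wheel        # if green, then four-wheel drive
    rule_2 = four_wheel != power_steering     # one or the other, but not both
    return rule_1 and rule_2


# Every combination of the three properties that is consistent with the premises.
cases = [c for c in product([True, False], repeat=3) if premises_hold(*c)]

# Supposition: the car is green. Check what follows in the cases that remain.
under_supposition = [c for c in cases if c[0]]
no_power_steering = all(not ps for _, _, ps in under_supposition)

# Discharging the supposition: the conditional must hold in all cases,
# whether or not the car is actually green.
conditional_holds = all((not g) or (not ps) for g, _, ps in cases)

print(no_power_steering, conditional_holds)
```

The first check mirrors the two-step derivation made under the supposition; the second confirms that the conditional conclusion no longer depends on it.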

Some philosophers described inference 
rule systems as “natural logics,” reflecting the 
idea that ordinary people reason by apply- 
ing such rules. This has been developed by 
modern psychologists into sophisticated psy- 
chological theories of rule-based reasoning, 
often described as “mental logics.” The best- 
developed systems are those of Rips (1994) 
and Braine and O’Brien (1998). According 
to these accounts, people reason by abstract- 
ing the underlying logical structure of argu- 
ments and then applying inference rules. Di- 
rect rules of inferences, such as MP, are ap- 
plied immediately and effortlessly. Indirect, 
suppositional rules such as CP are more dif- 
ficult and error prone. Although MT is in- 
cluded as a standard rule in propositional 
logic, mental logicians do not include this 
as a direct rule of inference for the simple 
reason that people find it difficult. Here is 
an MT argument: 


If the card has an A on the left, then it has
a 3 on the right.
The card does not have a 3 on the right.
Therefore, the card does not have an A on
the left.
               First premise   Second premise   Conclusion
Possibility    if A then 3     not-3            not-A

A, 3           True            False            False
A, not-3       False           True             False
Not-A, 3       True            False            True
Not-A, not-3   True            True             True


Whereas MP is made nearly 100% of the 
time with such abstract materials, MT rates 
are quite variable but typically around 70% 
to 75% (Evans et al., 1993). Mental logicians 
therefore propose that it depends on an indi- 
rect suppositional rule known as reductio ad 
absurdum (RAA). This rule states that if a 
supposition leads to a contradiction, then the 
negation of the supposition is a valid conclu- 
sion. With the previous argument, we make the sup-
position that the card has an A on the left. 
Hence, it follows that there is a 3 on the 
right (MP). However, we are told that there 
is not a 3 on the right, which gives us a con-
tradiction. Contradictions are not logically 
possible, and so the supposition from which 
it followed must be false. Hence, the conclu- 
sion given must be true. 

A powerful rival account of deductive 
reasoning is given by the mental model the- 
ory (Johnson-Laird, 1983; Johnson-Laird & 
Byrne, 1991, 2002; see Johnson-Laird, Chap. 
9), which is based on the semantic logical 
approach. The semantic method proves ar- 
guments by examining logical possibilities. 
In this approach, for example, the previous 
MT argument could be proved by truth table 
analysis. This involves writing down a line in 
the truth table for each possibility and evalu- 
ating both premises and conclusions. An ar- 
gument is valid if there is not a line in the 
table where the premises are true and the 
conclusion false. A truth table analysis for 
the previous argument is shown in Table 8.1. 

It should be noted that the previous anal- 
ysis, in accord with standard propositional 
logic, assumes the conditional statement “if 
p then q” conveys a logical relationship 
called material implication. Severe doubts 
have been expressed in both the philosoph- 
ical and psychological literatures that the 
could be a material conditional (Edgington,
1995; Evans, Handley, & Over, 2003; Evans 
& Over, 2004). However, this distinction 
does not affect the validity of the arguments 
discussed here. In the previous example, be- 
cause there is no case in which true premises 
can lead to a false conclusion, the argu- 
ment is valid. Let us contrast this with one 
of the classical fallacies of conditional rea- 
soning known as affirmation of the conse- 
quent (AC). Suppose we are tempted to ar- 
gue from the previous conditional that if the 
letter on the right is known to be a 3, then 
the letter on the left must be an A. See 
Table 8.2 for the truth table. 

The analysis exposes the argument as a 
fallacy because there is a state of affairs — 
a card that does not have an A on the left 
but has a 3 on the right — in which the 
premises would both be true but the con- 
clusion false. 
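
The truth table method is simple enough to state as a short program. The following sketch assumes the material conditional of standard propositional logic, as the analysis above does, with p standing for "A on the left" and q for "3 on the right". The function names are hypothetical.

```python
from itertools import product


def if_then(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def counterexamples(premises, conclusion):
    """Truth table rows where every premise is true but the conclusion is false."""
    return [(p, q)
            for p, q in product([True, False], repeat=2)
            if all(premise(p, q) for premise in premises)
            and not conclusion(p, q)]


ARGUMENTS = {
    "MP": ([if_then, lambda p, q: p], lambda p, q: q),
    "MT": ([if_then, lambda p, q: not q], lambda p, q: not p),
    "AC": ([if_then, lambda p, q: q], lambda p, q: p),
}

for name, (premises, conclusion) in ARGUMENTS.items():
    rows = counterexamples(premises, conclusion)
    print(name, "valid" if not rows else f"fallacy, counterexample rows: {rows}")
```

MP and MT yield no counterexample rows. AC yields the single row in which p is false and q is true, the card without an A on the left but with a 3 on the right.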

Just as the mental logic approaches do not 
simply adopt the inference rules of standard 
logic to account for human reasoning, so the 
mental models approach does not endorse 
truth table analysis either (Johnson-Laird &
Byrne, 1991, 2002). Mental models do repre-
sent logical possibilities, but the model the- 
ory adds psychological proposals about how 
people construct and reason with such mod- 
els. First, according to the principle of truth, 
people normally represent only true possi- 
bilities. Hence, the theory proposes that the 
full meaning of a “basic conditional” is the 
explicit set of true possibilities: 


{pq, ¬pq, ¬p¬q}


where ¬ means “not.” Second, owing to
working memory limitations, people form 


Table 8.2. Truth Table Analysis

               First premise   Second premise   Conclusion
Possibility    if A then 3     3                A

A, 3           True            True             True
A, not-3       False           False            True
Not-A, 3       True            True             False
Not-A, not-3   True            False            False
conditional if p then q is normally repre-
sented as 


[p]   q
 ...


where “...” is a mental footnote to the ef- 
fect that there may be other possibilities, al- 
though they are not explicitly represented. 
Like the mental logic theory, mental model 
theory gives an account of why MP is eas- 
ier than MT. The square brackets around 
p in the model for the pq possibility indi- 
cate that p is exhaustively represented with 
respect to q (that is, it must be present in 
all models that include q). Hence, when the 
premise p is presented, there is no need to 
flesh out any other possibilities and the con- 
clusion q can be drawn right away (MP). 
When the MT argument is presented, how- 
ever, the second premise is not-q, which
is not represented in any explicit model. 
Consequently, some people will say that 
“nothing follows.” 

Successful MT reasoners, according to 
this theory, flesh out the explicit models for 
the conditional: 


 p  q
¬p  q
¬p ¬q

The second premise eliminates the first two
models, leaving only the possibility ¬p¬q.
Hence, the conclusion not-p must follow.
With regard to the MT problem presented
earlier, this means that people must decide
that if there is not a 3 on the right of the card,
the only possibility consistent with the con-
ditional is that the card does not have an A
on the left either.
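
The contrast between the initial and the fleshed-out representation can be made concrete with a toy program. This is a deliberately simplified sketch and not an implementation of the mental model theory: it assumes that a model is just a pair of truth values for p and q, that a premise eliminates every explicit model inconsistent with it, and that a conclusion is drawn only if all remaining models agree. The function name is hypothetical.

```python
# Each explicit model is a pair (p, q).
INITIAL = [(True, True)]                                      # [p] q, with the rest left implicit
FLESHED_OUT = [(True, True), (False, True), (False, False)]   # all true possibilities

INDEX = {"p": 0, "q": 1}


def draw_conclusion(models, given, value, ask):
    """Keep the models consistent with the premise, then read off `ask` if determinate."""
    remaining = [m for m in models if m[INDEX[given]] == value]
    if not remaining:
        return "nothing follows"          # the premise matches no explicit model
    answers = {m[INDEX[ask]] for m in remaining}
    if len(answers) > 1:
        return "nothing follows"          # the remaining models disagree
    return ask if answers.pop() else f"not-{ask}"


print(draw_conclusion(INITIAL, "p", True, "q"))        # MP from the initial model
print(draw_conclusion(INITIAL, "q", False, "p"))       # MT attempted on the initial model
print(draw_conclusion(FLESHED_OUT, "q", False, "p"))   # MT after fleshing out
```

With only the initial model, the premise not-q matches nothing explicit, which corresponds to the "nothing follows" response. Once the models are fleshed out, not-q leaves a single model and not-p can be read off.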

The model theory was originally devel- 
oped to account for syllogistic reasoning of 
the kind considered earlier (Johnson-Laird & 
Bara, 1984). In this version, it was argued 
that people formed a model of the premises 
and formulated a provisional conclusion 
consistent with this model. It was further 
proposed that people made an effort at de- 
duction by searching for a counterexample 
case, that is, a model that agrees with the 


involves the same semantic principle as truth 
table analysis: An argument is valid if there 
is no counterexample to it in which the 
premises hold and the conclusion does not. 
Although this accounts for deductive com- 
petence, the main finding on syllogistic rea- 
soning is that people in fact endorse many 
fallacies. By analyzing the nature of the fal- 
lacies that people make and those they avoid, 
Evans et al. (1999) were able to provide 
strong evidence that people do not normally 
search for counterexample cases during syl- 
logistic reasoning. Some fallacies are made 
as frequently as valid inferences and some as 
infrequently as on syllogisms where the con- 
clusion is impossible. This strongly suggests 
that people consider only a single model of 
the premises, endorsing the fallacy if this 
model happens to include the conclusion. 
This issue has also been addressed in more 
recent papers by Newstead, Handley, and 
Buck (1999) and by Bucciarelli & Johnson-
Laird (1999). 

Both the mental logic and mental mod- 
els theories described here provide abstract, 
general-purpose systems that can account 
for human deductive competence across any 
domain, but that also allow for error. There 
has been a protracted — and in my view, in- 
conclusive — debate between advocates of 
the two theories with many claims and coun- 
terclaims that one side or the other had 
found decisive empirical evidence (for re- 
view and discussion, see Evans et al., 1993, 
Chap. 3; Evans & Over, 1996, 1997). It is 
important to note that these two theories 
by no means exhaust the major theoreti- 
cal attempts to account for the findings in 
reasoning experiments, although other the- 
orists are less concerned with providing a 
general account of deductive competence. 
Other approaches include theories framed in 
terms of content-specific rules such as prag- 
matic reasoning schemas (Cheng & Holyoak, 
1985; Holyoak & Cheng, 1995) or Darwinian
algorithms (Cosmides, 1989; Fiddick,
Cosmides, & Tooby, 2000), which were de- 
signed to account for content and context 
effects in reasoning discussed in the next sec- 
tion. The heuristic-analytic theory of Evans 
count of biases in deductive reasoning tasks
to which we now turn. 


Biases in Deductive Reasoning 


I have already mentioned that people are 
very prone to making fallacies in syllogis- 
tic reasoning and that they do not always 
succeed in drawing valid inferences such as 
MT in conditional reasoning. In fact, peo- 
ple make many logical errors generally on 
deductive reasoning tasks. These errors are 
not necessarily random but often system- 
atic, leading to their description by the term bias. We
should note at this point that a bias is by 
definition a regular deviation from the logic 
norm and defer for the time being the ques- 
tion of whether biases should be taken to 
indicate irrationality. 

One of the earliest known biases in con- 
ditional reasoning was that of “negative con- 
clusion bias” (Evans, 1982), which affects 
several conditional inferences, including MT 
(Schroyens, Schaeken, & d’Ydewalle, 2001). 
I gave an example of an MT inference earlier, 
with an affirmative conditional statement, 
and said that people solve this about 75% of 
the time. Consider a subtly changed version 
of the earlier problem: 


If the card does not have an A on the left, 
then it has a 3 on the right. 


The card does not have a 3 on the right. 
Therefore, the card has an A on the left. 


The difference is that a negative has been 
introduced into the first part of the condi- 
tional and the conclusion is now affirmative. 
This argument is still MT and valid, but now 
only around 40% to 50% of the time do peo- 
ple succeed in making it — a very large and 
reliable difference across many studies. The 
most likely account of this bias is a double 
negation effect. Reasoning by RAA on the 
previous problem will, following discovery 
of the contradiction, lead one to conclude 
that the supposition that the card does not 
have an A on the left must be false. How- 
ever, this is a double negative from which 


A must be on the left. The double nega- 
tion effect can also be given an interpre- 
tation within mental model theory (Evans, 
Clibbens, & Rood, 1995). 

Introducing negatives into conditional 
statements can also cause an effect known 
as matching bias (Evans, 1998). This is best 
illustrated in a problem known as the Wason 
selection task (Wason, 1966). Although not 
strictly a deductive reasoning task, the selec- 
tion task involves the logic of conditionals 
and is considered part of the literature on 
deduction. In a typical abstract version
of the problem, participants are shown four 
cards lying on a table and told that each has 
a capital letter on one side and a single figure 
number on the other. The visible sides are 


B L 2 9 


They are told that the following rule ap- 
plies to these four cards and may be true or 
false: 


If a card has a B on one side, then it has a 
2 on the other side. 


The task is to decide which cards need to be 
turned over in order to check whether the 
rule is true or false. Wason argued that the 
correct choice is B and 9 because only a card 
with a B on one side and a number other than
2 on the other side could disprove the rule. 
Most subsequent researchers have accepted 
this normative analysis, although some ar- 
gue against it on the assumption that people 
interpret the task as having to do with cate- 
gories rather than specific cards (Oaksford & 
Chater, 1994). In any event, only around 
10% of university students typically choose 
the B and 9. The most common choices are 
B and 2, or just B. Wason originally argued 
that this provided evidence of a confirmation 
bias in reasoning (Wason & Johnson-Laird, 
1972). That is, participants were trying to 
discover the confirming combination of B 
and 2 rather than the disconfirming combi- 
nation of B and 9. 

Wason later abandoned this account, 
however, in light of the evidence of Evans 
and Lynch (1973). These authors argued that 
cards are also the matching cards, in other
words, those that match the values specified 
in the rule. By introducing negative compo- 
nents, it is possible to separate the two ac- 
counts. For example, suppose the rule was 


If a card has a B on one side, then it does 
NOT have a 2 on the other side.


Now the matching choice of B and 2 is 
also the correct choice because a card with a 
B on one side and a 2 on the other side could
disprove the rule. Nearly everyone gets the 
task right with this version — a curious case 
of a negative making things a lot easier. In 
fact, when the presence of negatives is sys- 
tematically rotated, the pattern of findings 
strongly supports matching bias in both the 
Evans and Lynch (1973) study and a number 
of replication experiments reported later in 
the literature (Evans, 1998). 
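
The normative analysis of the selection task amounts to asking, for each visible face, whether some hidden face could complete a falsifying card. The sketch below illustrates this for the affirmative rule and for the version with a negated consequent. As a simplifying assumption, the hidden side is restricted to the values in play (B or L, 2 or 9), which stand in for "B versus not-B" and "2 versus not-2". The function names are hypothetical.

```python
LETTERS = ["B", "L"]
NUMBERS = ["2", "9"]


def falsifies(letter, number, negated_consequent):
    """Does this card break 'if B then 2' (or 'if B then NOT 2')?"""
    if letter != "B":
        return False                      # the rule says nothing about other letters
    has_two = number == "2"
    return has_two if negated_consequent else not has_two


def cards_to_turn(visible, negated_consequent=False):
    """Visible faces whose hidden side could reveal a falsifying card."""
    chosen = []
    for face in visible:
        if face in LETTERS:
            possible_cards = [(face, n) for n in NUMBERS]
        else:
            possible_cards = [(l, face) for l in LETTERS]
        if any(falsifies(l, n, negated_consequent) for l, n in possible_cards):
            chosen.append(face)
    return chosen


visible = ["B", "L", "2", "9"]
print(cards_to_turn(visible))                            # rule: if B then 2
print(cards_to_turn(visible, negated_consequent=True))   # rule: if B then NOT 2
```

For the affirmative rule the procedure selects B and 9; for the negated rule it selects B and 2, so that the matching choice and the logically correct choice coincide.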

What then is the cause of this match- 
ing bias? There is strong evidence that it 
reflects difficulty in processing implicit nega- 
tion. Evans, Clibbens, and Rood (1996) pre- 
sented descriptions of the cards in place of 
the actual cards. In the materials of the ex- 
ample given previously, their descriptions 
for an implicit and explicit negation group 
were as follows: 


Implicit negation                    Explicit negation

The letter on the card is a B.       The letter on the card is a B.
The letter on the card is an L.      The letter on the card is not a B.
The number on the card is a 2.       The number on the card is a 2.
The number on the card is a 9.       The number on the card is not a 2.


The presence of negations was also var- 
ied in the conditionals in order to provide 
the standard method of testing for matching 
bias. Whereas the implicit negation group 
showed normal strong matching bias, there 
was no matching bias at all in the explicit 
negation group. However, this group did not 
perform more logically. They simply picked 
more of the mismatching cards that would 
normally have been suppressed, regardless 


Of course, in the explicit negation group, 
the negative cases really still match because 
they refer to the letter and number in the 
conditional statement. In spite of this strong 
evidence, an alternative theory of match- 
ing bias has been promoted by Oaksford 
and Chater (1994) based on expected in- 
formation gain (negative statements con- 
vey less information). Yama (2001) more 
recently reported experiments trying to sep- 
arate the two accounts with somewhat ambi- 
valent findings. 

One of the most important biases inves- 
tigated in the deductive reasoning literature 
is the belief bias effect, which is typically 
but inaccurately described as a tendency to 
endorse the validity of arguments when you 
agree with their conclusions. I consider the 
belief bias effect in the following section on 
content and context effects. First, I briefly 
discuss the implications of reasoning biases 
for the debate about human rationality. Co- 
hen (1981) was one of the first critics to 
launch an attack on research in this field, 
as well as the related “heuristic and biases” 
program of work on probability judgment 
(Gilovich, Griffin, & Kahneman, 2002; Kah- 
neman, Slovic, & Tversky, 1982; see Kahne- 
man & Frederick, Chap. 12). Cohen argued 
that evidence of error and bias in experi- 
ments on reasoning and judgment should not 
be taken as evidence of human irrationality. 
Cohen’s arguments fall into three categories 
that have also been reflected in writings of 
subsequent authors: the normative system 
problem, the interpretation problem, and the 
external validity problem (Evans, 1993). 

The first issue is that people can only be 
judged to be in error relative to some norma- 
tive system that may well be disputable. For 
example, philosophers have proposed alter- 
native logics, and the standard propositional 
logic for deductive reasoning can be seen 
as mapping poorly to real world reasoning, 
which allows for uncertainty and the with- 
drawal of inferences in light of new evidence 
(Evans & Over, 1996; Oaksford & Chater, 
1998). The interpretation problem is that 
correctness of inference is judged on the as- 
sumption that the participant understands 
is also a pertinent criticism. As I (Evans,
2002, p. 991) previously put it: 


The interpretation problem is a very seri- 
ous one indeed for traditional users of the 
deduction paradigm who wish to assess log- 
ical accuracy. To pass muster, participants 
are required not only to disregard prob- 
lem content but also any prior beliefs they 
have relevant to it. They must translate the 
problem into a logical representation using 
the interpretation of key terms that accord 
with a textbook (not supplied) of standard 
logic... whilst disregarding the meaning of 
the same terms in everyday discourse. 


The external validity argument is that 
the demonstration of cognitive biases and il- 
lusions in the psychological laboratory does 
not necessarily tell us anything about the real 
world. This one I have much less sympa- 
thy with. The laws of psychology apply in 
the laboratory, as well as everywhere else, 
and many of the biases that have been dis- 
covered have been shown to also affect ex- 
pert groups. For example, base rate neglect 
in statistical reasoning has been shown many 
times in medical and other expert groups 
(Koehler, 1996), and there are numerous 
real world studies of heuristics and biases 
(Fischhoff, 2002). 

One way of dealing with the normative 
system problem is to distinguish between 
normative and personal rationality (Ander- 
son, 1990; Evans & Over, 1996). Logical er- 
rors on deductive reasoning tasks violate nor- 
mative rationality because the instructions 
require one to assume the premises and draw 
necessary conclusions. Whether they violate 
personal rationality is moot, however, be- 
cause we may have little use for deductive 
reasoning in everyday life and carry over in- 
appropriate but normally useful procedures 
instead (Evans & Over, 1996). A different 
distinction is that between individual and 
evolutionary rationality (Stanovich, 1999; 
Stanovich & West, 2000, 2003). Stanovich 
argues that what serves the interests of the 
genes does not always serve the interests of 
the individual. In particular, the tendency 
to contextualize all problems against back- 


section) may prevent us from the kind of ab- 
stract reasoning that is needed in a modern 
technological society, so different from the 
world in which we evolved. 


Content and Context Effects 


Once thematic materials are introduced into 
deductive reasoning experiments, especially 
when some kind of context — however min- 
imal — is given, participants’ responses be- 
come heavily influenced by pragmatic fac- 
tors. This has led paradoxically to claims 
both that familiar problem content can facil- 
itate logical reasoning and that such familiar- 
ity can be a cause of bias! The task on which
facilitation is usually claimed is the deontic 
selection task that we examine first. 


The Deontic Selection Task 


It has been known for many years that “re- 
alistic” versions of the Wason selection task 
can facilitate correct card choices, although 
it was not immediately realized that most of 
these versions change the logic of the task 
from one of indicative reasoning to one of 
deontic reasoning. An indicative conditional, 
of the type used in the standard abstract task 
discussed earlier, makes an assertion about 
the state of the world that may be true or 
false. Deontic conditionals concern rules and 
regulations and are often phrased using the 
terms “may” or “must,” although these may 
be implicit. A rule such as “if you are driv- 
ing on the highway then you must keep your 
speed under 70 mph” cannot be true or false. 
It may or may not be in force, and it may or 
may not be obeyed. 

A good example of a facilitatory ver- 
sion of the selection task is the drinking 
age problem (Griggs & Cox, 1982). Partici- 
pants are told to imagine that they are police 
officers observing people drinking in a bar 
and making sure that they comply with the 
following law: 


If a person is drinking in a bar, then that 
person must be over 19 years of age 
population group is being presented with the
task and normally corresponds to the local 
law it knows.) They are told that each card 
represents a drinker and has on one side the 
beverage being drunk and on the other side 
the age of the drinker. The visible sides of 
the four cards show: 


Drinking beer     Drinking coke     16 years of age     22 years of age


The standard instruction is to choose 
those cards that could show that the rule 
is being violated. The correct choice is the 
drinking beer and 16 year old, and most peo- 
ple choose this. Compared with the abstract 
task, it is very easy. However, the task has 
not simply been made realistic. It is a deontic 
task and one in which the context makes not 
only the importance of violation salient but 
also makes it very easy to identify the violat- 
ing case. There have been many replications 
and variations of such tasks (see Evans et al., 
1993, and Manktelow, 1999, for reviews). It 
has been established that real world knowl- 
edge of the actual rule is not necessary to 
achieve facilitation (see, for example, Cheng 
& Holyoak, 1985). Rules that express per- 
mission or obligation relationships in plau- 
sible settings usually lead people to the ap- 
propriate card choices. 
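
The logic of the correct choice on the drinking age problem can be sketched in a few lines of Python. This is only an illustration of the normative answer, not a model of how participants reason. The function names are hypothetical, and the sketch assumes that a hidden side can only show one of the values visible on the four cards.

```python
# Hypothetical encoding of the drinking age problem (names are illustrative).
LEGAL_AGE = 19  # the rule: a person drinking beer must be over 19

BEVERAGES = ["beer", "coke"]  # assumed possible values of a hidden beverage side
AGES = [16, 22]               # assumed possible values of a hidden age side


def violates(beverage, age):
    """A drinker breaks the rule only by drinking beer while not over the legal age."""
    return beverage == "beer" and age <= LEGAL_AGE


def cards_to_turn(visible_cards):
    """Return the cards whose hidden side could reveal a violation of the rule."""
    chosen = []
    for kind, value in visible_cards:
        if kind == "beverage":
            could_violate = any(violates(value, age) for age in AGES)
        else:  # the visible side shows an age
            could_violate = any(violates(beverage, value) for beverage in BEVERAGES)
        if could_violate:
            chosen.append((kind, value))
    return chosen


cards = [("beverage", "beer"), ("beverage", "coke"), ("age", 16), ("age", 22)]
print(cards_to_turn(cards))  # [('beverage', 'beer'), ('age', 16)]
```

Only the beer drinker and the 16 year old can turn out to be violators, which matches the choice that most participants make.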

Most of the elements of presentation of 
the drinking age problem as originally de- 
vised by Griggs and Cox need to be in place, 
however. Removing the deontic orientation 
of the violation instructions greatly weak- 
ens the effect (see Evans et al., 1993), and 
removing the minimal context about the 
police officer blocks most of the facilita- 
tion (Pollard & Evans, 1987). Hence, it is 
important to evoke pragmatic processes of 
some kind that introduce prior knowledge 
into the reasoning process. These factors 
can override the actual syntax of the condi- 
tional rule. Several authors discovered inde- 
pendently that the perspective given to the 
participant in the scenario can change card 
choices (Gigerenzer & Hug, 1992; Mank- 
telow & Over, 1991; Politzer & Nguyen- 
Xuan, 1992). For example, imagine that a 
big department store, struggling for business, 
announces the following offer: 


If a customer spends more than $100, 
then he or she may take a free gift. 


The four cards represent customers showing 
the amount spent on one side and whether 
they received a gift on the other: “spent 
$120,” “spent $75,” “received gift,” “did not 
take gift.” If participants are given the per 
spective of a store detective looking for 
cheating customers, they turn over cards 2 
and 3 because a cheater would be taking the 
gift without spending $100. If they are given 
the perspective of a customer checking that 
the store is keeping its promise, however, 
they turn cards 1 and 4 because a cheating 
store would not provide the gift to customers 
who spent the required amount. 
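
The perspective effect can be made concrete with a small sketch in which the same four cards are checked against two different definitions of cheating. The predicate names and the card numbering are assumptions introduced for illustration, and hidden sides are assumed to show only values that appear on the visible cards.

```python
# Hypothetical sketch of the department store problem (names are illustrative).
THRESHOLD = 100  # the offer applies to customers who spend more than $100

CARDS = {1: ("spent", 120), 2: ("spent", 75), 3: ("gift", True), 4: ("gift", False)}
AMOUNTS = [120, 75]    # assumed possible hidden amounts
GIFTS = [True, False]  # assumed possible hidden gift outcomes


def customer_cheats(spent, took_gift):
    """Store detective's concern: a gift taken without the required spending."""
    return took_gift and spent <= THRESHOLD


def store_cheats(spent, took_gift):
    """Customer's concern: the required spending without a gift."""
    return spent > THRESHOLD and not took_gift


def cards_to_turn(is_violation):
    """Return the numbers of the cards whose hidden side could reveal a violation."""
    chosen = []
    for number, (kind, value) in CARDS.items():
        if kind == "spent":
            possible = any(is_violation(value, gift) for gift in GIFTS)
        else:
            possible = any(is_violation(amount, value) for amount in AMOUNTS)
        if possible:
            chosen.append(number)
    return chosen


print(cards_to_turn(customer_cheats))  # [2, 3]
print(cards_to_turn(store_cheats))     # [1, 4]
```

The conditional rule is identical in both calls; only the definition of a violation changes, and with it the relevant cards.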

There are several theoretical accounts of 
the deontic selection task in the literature. 
One of the earliest was the pragmatic rea- 
soning schema theory of Cheng and Holyoak 
(1985). These authors proposed that peo- 
ple retrieve and apply a permission schema 
comprising a set of production rules. For ex- 
ample, on the drinking age problem, you 
need to fulfil the precondition of being older 
than 19 years of age in order to have permis- 
sion to drink beer in a bar. Once these ele- 
ments are recognized and encoded as “pre- 
condition” and “action,” the abstract rules of 
the schema can be applied, leading to ap- 
propriate card choices. This theory does not 
suppose that some general process of logi- 
cal reasoning is being facilitated. The authors 
later added an obligation schema to explain 
the perspective shift effect discussed previ- 
ously (Holyoak & Cheng, 1995). The rules of 
the obligation schema change the pattern of 
card choices, and the perspective determines 
which schema is retrieved and applied. 

A well-known but somewhat controver- 
sial theory is that choices on the deontic se- 
lection task are determined by Darwinian 
algorithms for social contracts, leading to 
cheater detection, or else by an innate hazard 
avoidance module (Cosmides, 1989; Fiddick 
et al., 2000). The idea is that such mod- 
ules would have been useful in the evolv- 
ing environment, although that does not in 
itself constitute evidence for them (Fodor, 
and environmental biology, this work has 
been subject to a number of criticisms in the 
psychological literature (Cheng & Holyoak, 
1989; Evans & Over, 1996; Sperber, Cara, 
& Girotto, 1995; Sperber & Girotto, 2002). 
One criticism is that the responses that are 
predicted are those that would be adap- 
tive in contemporary society and so could 
be accounted for by social learning in the 
lifetime of the individual; another is that 
the effects to which the theory is applied 
can be accounted for by much more gen- 
eral cognitive processes. These include the- 
ories that treat the selection task as a decision 
task in which people make choices in accord 
with expected utility (Evans & Over, 1996; 
Manktelow & Over, 1991; Oaksford & 
Chater, 1994), as well as a theory applying 
principles of pragmatic relevance (Sperber 
et al., 1995). 

Regardless of which, if any, of these ac- 
counts may be correct, it is clear that prag- 
matic processes heavily influence the deontic 
selection task. I have more to say about this 
in a later section of the chapter when dis- 
cussing “dual process” theory. 


Biasing Effects of Content and Context 


In contrast with the claims of facilitation ef- 
fects on the Wason selection task, psycholo- 
gists have produced evidence that introduc- 
ing real world knowledge may bias responses 
to deductive reasoning tasks. It is known, 
for example, that certain logically valid in- 
ferences that people normally draw can be 
suppressed when people introduce back- 
ground knowledge (see Evans et al., 1993, 
pp. 55-61). Suppose you give people the 
following problem: 


If she meets her friend, she will go to a 
play. 

She meets her friend. 

What follows? 


Nearly everyone will say that she will go 
to the play. This is a very simple and, of 
course, valid argument known in logic as MP. 
Many participants will also make the MT in- 
ference when given the second premise 
“she does not go to the play,” inferring that 
“she does not meet her friend.” These infer- 
ences are easily defeated by additional infor- 
mation, however, a process known techni- 
cally as defeasible inference (Elio & Pelletier, 
1997; Oaksford & Chater, 1991). Suppose 
we add an extra statement: 


If she meets her friend, she will go to a 
play. 

If she has enough money, she will go to a 
play. 

She meets her friend. 


What follows? 


In one study (Byrne, 1989), 96% of par- 
ticipants gave the conclusion “she goes to 
the play” for the first MP problem, but only 
38% for the second problem. In standard 
logic, an argument that follows from some 
premises must still follow if you add new 
information. What is happening psychologi- 
cally in the second case is that the extra con- 
ditional statement introduces doubt about 
the truth of the first. People start to think 
that, even though she wants to go to the 
play with her friend, she might not be able 
to afford it, and the lack of money will pre- 
vent her. The same manipulation inhibits the 
MT inference. 
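
The point about standard logic can be checked mechanically. The sketch below assumes that each conditional is read as a material conditional (an idealization that the participants evidently do not share) and tests entailment by enumerating truth assignments. The variable and function names are illustrative.

```python
from itertools import product


def entails(premises, conclusion, variables):
    """True if the conclusion holds in every assignment that satisfies all premises."""
    for values in product([True, False], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(premise(world) for premise in premises) and not conclusion(world):
            return False
    return True


def implies(antecedent, consequent):
    """Material conditional: false only when the antecedent is true and the consequent false."""
    return (not antecedent) or consequent


VARIABLES = ["meets_friend", "has_money", "goes_to_play"]

first_conditional = lambda w: implies(w["meets_friend"], w["goes_to_play"])
second_conditional = lambda w: implies(w["has_money"], w["goes_to_play"])
meets_friend = lambda w: w["meets_friend"]
goes_to_play = lambda w: w["goes_to_play"]

# MP with the original premises, then with the extra conditional added.
print(entails([first_conditional, meets_friend], goes_to_play, VARIABLES))  # True
print(entails([first_conditional, second_conditional, meets_friend],
              goes_to_play, VARIABLES))  # True
```

Under this reading the added premise cannot remove the conclusion, so the drop in endorsement reported by Byrne reflects how people interpret the premises rather than a property of the logic.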

This work illustrates the difficulty of us- 
ing the term “bias” in deductive reasoning 
research. Because a valid inference has been 
suppressed, the effect is technically a bias. 
However, the reasoning of the participants 
in this experiment seems perfectly reason- 
able and indeed more adaptive to everyday 
needs than a strictly logical answer would 
have been. A related finding is that, even 
though people may be told to assume the 
premises of arguments are true, they are re- 
luctant to draw conclusions if they person- 
ally do not believe the premises. In real life, 
of course, it makes perfect sense to base your 
reasoning only on information that you be- 
lieve to be true. 

In logic, there is a distinction drawn be- 
tween a valid inference and a sound infer- 
ence. A valid inference may lead to a false 
conclusion, if at least one premise is false, as 
in the following argument: 


All students are lazy. 
No lazy people pass examinations. 
Therefore, no students pass examinations. 


The falsity of the previous conclusion is 
more immediately evident than that of ei- 
ther of the premises. However, the argu- 
ment is valid, and so at least one premise 
must be false. A sound argument is a valid 
argument based on true premises and has 
the merit of guaranteeing a true conclu- 
sion. Because the standard deductive rea- 
soning task includes instructions to assume 
the premises, as well as to draw necessary 
conclusions, psychologists generally assume 
they have requested their participants to 
make validity judgments. However, there is 
evidence that when familiar problem con- 
tent is used, people respond as though they 
had been asked to judge soundness instead 
(Thompson, 2001). This might well account 
for the suppression of MP. The inference is 
so obvious that it can hardly reflect a failure 
in reasoning. 

People are also known to be influenced 
by the believability of the conclusion of the 
argument presented, reliably (and usually 
massively) preferring to endorse the valid- 
ity of arguments with believable rather than 
unbelievable conclusions, the so-called “be- 
lief bias” effect. The standard experiment 
uses syllogisms and independently manipu- 
lates the believability of the conclusion and 
the validity of the argument. People accept 
both more valid arguments (logic effect) and 
more believable conclusions (belief effect), 
and the two factors normally interact (Evans, 
Barston, & Pollard, 1983). This is because 
the belief bias effect is much stronger on 
invalid than valid arguments. The effect is 
really misnamed, however, because as we 
saw in our earlier discussion, people tend 
to endorse many fallacies when engaged in 
abstract syllogistic reasoning. When belief- 
neutral content is included in belief bias 
experiments, the effect of belief is shown 
to be largely negative: Unbelievable conclu- 
sions cause people to withhold fallacies that 
they would otherwise have made (Evans, 


well call it belief debias! 

Could people’s preference for sound ar- 
guments explain the belief bias effect? 
Many experiments in the literature have 
failed to control for the believability of 
premises. However, this can be done by in- 
troducing nonsense linking terms, as in the 
following syllogism: 


All fish are phylones. 
All phylones are trout. 
Therefore, all fish are trout. 


Because no one knows what a phylone 
is, he or she can hardly be expected to 
have any prior belief about either premise. 
However, the conclusion is clearly unbeliev- 
able, and the same technique can be made 
to render believable conclusions. Newstead, 
Pollard, Evans, and Allen (1992) found sub- 
stantial belief bias effects with such syllo- 
gisms. However, it could still be the case 
that people resist arguments with false con- 
clusions because such arguments must by 
definition be unsound. As we observed ear- 
lier, if the argument is valid and the conclu- 
sion false, at least one premise must be false, 
even if we cannot tell which one. For fur- 
ther discussion of this and related issues, see 
Evans et al. (2001) and Klauer, Musch, and 
Naumer (2000). 


Dual-Process Theory 


The deductive reasoning paradigm has 
yielded a wealth of psychological data over 
the past 40 years or so. Understanding 
the issues involved has been assisted by 
more recent developments in dual-process 
theories of reasoning (Evans, 2003; Evans 
& Over, 1996; Sloman, 1996; Stanovich, 
1999), which have gradually evolved from 
much earlier proposals in the reasoning lit- 
erature (Evans, 1984; Wason & Evans, 1975) 
and have been linked with research on im- 
plicit learning (see Litman & Reber, Chap. 
18; Dienes & Perner, 1999; Reber, 1993) 
and intuitive judgment (Gilovich & Griffin, 
Kahneman & Frederick, Chap. 12). The idea 
is that there are two distinct cognitive sys- 
tems with different evolutionary histories. 
System 1 (to use Stanovich’s terminology) 
is the ancient system that relies on asso- 
ciative learning through distributed neural 
networks and may also reflect the opera- 
tion of innate modules. It is really a bundle 
of systems that most theorists regard as 
implicit, meaning that only the final prod- 
ucts of such a process register in conscious- 
ness, and they may stimulate actions without 
any conscious reflection. System 2, in con- 
trast, is evolutionarily recent and arguably 
unique to humans. This system requires use 
of central working memory resources and 
is therefore slow and sequential in nature. 
System 2 function relates to general mea- 
sures of cognitive ability such as IQ, whereas 
system 1 function does not (Reber, 1993; 
Stanovich, 1999). However, system 2 allows 
us to engage in abstract reasoning and hy- 
pothetical thinking. There is more recent 
supporting evidence of a neuropsycholog- 
ical nature for this theory. When resolv- 
ing belief-logic conflicts in the belief bias 
paradigm, the response that dominates cor- 
relates with distinct areas of brain activ- 
ity (Goel, Buchel, Frith, & Dolan, 2000; see 
Goel, Chap. 20). 

Dual-process theory can help us make 
sense of much of the research on deductive 
reasoning that we have been discussing. It 
seems that the default mode of everyday rea- 
soning is pragmatic, reflecting the associative 
processes of system 1. Deductive reasoning 
experiments, however, include instructions 
that require a conscious effort at deduction 
and often require the suppression of prag- 
matic processes because we are asked to dis- 
regard relevant prior belief and knowledge. 
Hence, reasoning tasks often require strong 
system 2 intervention if they are to be solved. 
In support of this theory, Stanovich (1999) 
reviewed a large research program in which 
it was consistently shown that participants 
with high SAT scores (a measure of general 
cognitive ability) produced more normative 
solutions than those with lower scores on 
a wide range of reasoning, decision, and 


system 2. 

Consider the Wason selection task, for 
example. The abstract indicative version, 
which defeats most people, contains no 
helpful pragmatic cues and thus requires 
abstract logical reasoning for its solution. 
Stanovich and West (1998) accordingly 
showed that the small numbers who solve it 
have significantly higher SAT scores. How- 
ever, they also showed no difference in SAT 
scores between solvers and nonsolvers of 
the deontic selection task. This makes sense 
because the pragmatic processes that ac- 
count for the relative ease of this task are 
of the kind attributed in the theory to sys- 
tem 1. However, this does call into ques- 
tion whether the deontic selection task re- 
ally requires a process that we would want 
to call reasoning. The solution appears to 
be provided automatically, without con- 
scious reflection. 

If the theory is right, then system 2 in- 
tervention occurs mostly because of the 
use of explicit instructions requiring an ef- 
fort at deduction. We know that the in- 
structions used have a major influence on 
the response people make (Evans, Allen, 
Newstead, & Pollard, 1994; George, 1995; 
Stevenson & Over, 1995). The more in- 
structions emphasize logical necessity, the 
more logical the responding; when instruc- 
tions are relaxed and participants are asked 
if a conclusion follows, responses are much 
more strongly belief based. The ability to 
resist belief in belief-logic conflict prob- 
lems when instructed to reason logically is 
strongly linked to measures of cognitive abil- 
ity (Stanovich & West, 1997), and the same 
facility is known to decline sharply in old 
age (Gilinsky & Judd, 1994; see Salthouse, 
Chap. 24). This provides strong converging 
evidence for dual systems of reasoning (see 
also Sloman, 2002). 


Conclusions and Future Directions 


Research on deductive reasoning was origi- 
nally stimulated by the traditional interest in 
rational basis for human thinking. This ratio- 
nale has been considerably undermined over 
the past 40 years because many psycholo- 
gists have abandoned logic, first as a descrip- 
tive and later as a normative system for hu- 
man reasoning (Evans, 2002). Research with 
the deduction paradigm has also shown, as 
indicated in this chapter, that pragmatic pro- 
cesses have a very large influence once re- 
alistic content and context are introduced. 
Studying such processes using the paradigm 
necessarily defines them as biases because 
the task requires one to assume premises 
and draw necessary conclusions. However, it 
is far from clear that such biases should be 
regarded as evidence of irrationality, as dis- 
cussed earlier. 

The deductive reasoning field has seen 
discussion and debate of a wide range of 
theoretical ideas, a number of which have 
been described here. This includes the long- 
running debate over whether rule-based 
mental logics or mental model theory pro- 
vides the better account of basic deductive 
competence, as well as the development of 
accounts based on content-specific reason- 
ing, such as pragmatic reasoning schemas, 
relevance theory, and Darwinian algorithms. 
It has been a major focus for the develop- 
ment of dual-process theories of cognition, 
even though these have a much wider appli- 
cation. It has also been one of the major fields 
(alongside intuitive and statistical judgment) 
in which cognitive biases have been studied 
and their implications for human rationality 
debated at length. 

So where does the future of the deduction 
paradigm lie? I have suggested (Evans, 2002) 
that we should use a much wider range of 
methods for studying human reasoning, es- 
pecially when we are interested in investi- 
gating the pragmatic reasoning processes of 
system 1. In fact, there is no point at all in 
instructing people to make an effort at de- 
duction unless we are interested in system 
2 reasoning or want to set the two systems 
in conflict. However, this conflict is of both 
theoretical and practical interest and will un- 
doubtedly continue to be studied using the 
deduction paradigm. It is important, how- 


we are doing. It is no longer appropriate to 
equate performance on deductive reasoning 
tasks with rationality or to assume that logic 
provides an appropriate normative account 
of everyday, real world reasoning. 
CHAPTER 9 


Mental Models and Thought 


P.N. Johnson-Laird 


How do we think? One answer is that 
we rely on mental models. Perception yields 
models of the world that lie outside us. 
An understanding of discourse yields mod- 
els of the world that the speaker describes 
to us. Thinking, which enables us to antic- 
ipate the world and to choose a course of 
action, relies on internal manipulations of 
these mental models. This chapter is about 
this theory, which it refers to as the model 
theory, and its experimental corroborations. 
The theory aims to explain all sorts of think- 
ing about propositions, that is, thoughts ca- 
pable of being true or false. There are other 
sorts of thinking — the thinking, for in- 
stance, of a musician who is improvising. 
In daily life, unlike the psychological labo- 
ratory, no clear demarcation exists between 
one sort of thinking and another. Here is 
a protocol of a typical sequence of every- 
day thoughts: 


I had the book in the hotel’s restaurant, 
and now I’ve lost it. So, either I left it in the 
restaurant, or it fell out of my pocket on the 
way back to my room, or it’s somewhere 
here in my room. It couldn’t have fallen 


from my pocket — my pockets are deep and 
I walked slowly back to my room — and so 
it’s here or in the restaurant. 


Embedded in this sequence is a logical de- 
duction of the form: 


A or B or C. 
Not B. 
Therefore, A or C. 


The conclusion is valid: It must be true given 
that the premises are true. However, other 
sorts of thinking occur in the protocol (e.g., 
the inference that the book could not have 
fallen out of the protagonist’s pocket). 

A simple way to categorize thinking about 
propositions is in terms of its effects on se- 
mantic information (Johnson-Laird, 1993). 
The more possibilities an assertion rules out, 
the greater the amount of semantic informa- 
tion it conveys (Bar-Hillel & Carnap, 1964). 
Any step in thought from current premises 
to a new conclusion therefore falls into one 
of the following categories: 


• The premises and the conclusion elimi- 
nate the same possibilities. 

• The premises eliminate at least one 
more possibility over those the conclusion 
eliminates. 

• The conclusion eliminates at least one 
more possibility over those the premises 
eliminate. 

• The premises and conclusion eliminate 
disjoint possibilities. 

• The premises and conclusion eliminate 
overlapping possibilities. 


The first two categories are deductions (see 
Evans, Chap. 8). The third category in- 
cludes all the traditional cases of induction, 
which in general is definable as any thought 
yielding such an increase in semantic infor- 
mation (see Sloman & Lagnado, Chap. 3). 
The fourth category occurs only when the 
conclusion is inconsistent with the premises. 
The fifth case occurs when the conclusion 
is consistent with the premises but refutes 
at least one premise and adds at least one 
new proposition. Such thinking goes beyond 
induction. It is associative or creative (see 
Sternberg, Chap. 13). 
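
The five categories can be expressed as a comparison between two sets of eliminated possibilities. The following sketch is one way to do so, under the assumption that possibilities are truth assignments to two propositions, A and B. The function names are hypothetical.

```python
from itertools import product

# Assumed universe of possibilities: truth assignments to two propositions (A, B).
WORLDS = list(product([True, False], repeat=2))


def eliminated(assertion):
    """The set of possibilities that an assertion rules out."""
    return {world for world in WORLDS if not assertion(*world)}


def categorize(premises_eliminate, conclusion_eliminates):
    """Return the number (1 to 5) of the category that a step in thought falls into."""
    p, c = set(premises_eliminate), set(conclusion_eliminates)
    if p == c:
        return 1  # same possibilities eliminated
    if c < p:
        return 2  # premises eliminate more: semantic information is lost
    if p < c:
        return 3  # conclusion eliminates more: semantic information is gained
    if p.isdisjoint(c):
        return 4  # disjoint possibilities eliminated
    return 5      # overlapping possibilities eliminated


a_and_b = eliminated(lambda a, b: a and b)
just_a = eliminated(lambda a, b: a)
not_a = eliminated(lambda a, b: not a)
just_b = eliminated(lambda a, b: b)

print(categorize(a_and_b, just_a))  # 2: from "A and B" to "A"
print(categorize(just_a, a_and_b))  # 3: from "A" to "A and B"
print(categorize(just_a, not_a))    # 4: from "A" to "not A"
print(categorize(just_a, just_b))   # 5: from "A" to "B"
```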

The model theory aims to explain all 
propositional thinking, and this chapter il- 
lustrates its application to the five preceding 
categories. The chapter begins with the his- 
tory of the model theory. It then outlines 
the current theory and its account of de- 
duction. It reviews some of the evidence for 
this account. It shows how the theory ex- 
tends to probabilistic reasoning. It then turns 
to induction, and it describes the uncon- 
scious inferences that occur in understand- 
ing discourse. It shows how models underlie 
causal relations and the creation of expla- 
nations. Finally, it assesses the future of the 
model theory. 


The History of Mental Models 


In the seminal fifth chapter of his book, The 
Nature of Explanation, Kenneth Craik (1943) 
wrote: 


If the organism carries a “small-scale 
model” of external reality and of its own 
possible actions within its head, it is able to 
try out various alternatives, conclude which 
is the best of them, react to future situations 
before they arise, utilize the knowledge of 
past events in dealing with the present and 
the future, and in every way to react in 
a much fuller, safer, and more competent 
manner to the emergencies which face it. 


This same process of internal imitation of 
the external world, Craik wrote, is carried 
out by mechanical devices such as Kelvin’s 
tidal predictor. Craik died in 1945, before 
he could develop his ideas. Several earlier 
thinkers had, in fact, anticipated him (see 
Johnson-Laird, 2003). Nineteenth-century 
physicists, including Kelvin, Boltzmann, and 
Maxwell, stressed the role of models in 
thinking. In the twentieth century, physicists 
downplayed these ideas with the advent of 
quantum theory (but cf. Deutsch, 1997). 

One principle of the modern theory 
is that the parts of a mental model and 
their structural relations correspond to those 
which they represent. This idea has many 
antecedents. It occurs in Maxwell’s (1911) 
views on diagrams, in Wittgenstein’s (1922) 
“picture” theory of meaning, and in Kohler’s 
(1938) hypothesis of an isomorphism be- 
tween brain fields and the world. However, 
the nineteenth-century grandfather of the 
model theory is Charles Sanders Peirce. 

Peirce coinvented the main system of 
logic known as predicate calculus, which gov- 
erns sentences in a formal language contain- 
ing idealized versions of negation, sentential 
connectives such as “and” and “or,” and quan- 
tifiers such as “all” and “some.” Peirce devised 
two diagrammatic systems of reasoning, not 
to improve reasoning, but to display its un- 
derlying mental steps (see Johnson-Laird, 
2002). He wrote: 


Deduction is that mode of reasoning which 
examines the state of things asserted in the 
premisses, forms a diagram of that state 
of things, perceives in the parts of the di- 
agram relations not explicitly mentioned in 
the premisses, satisfies itself by mental ex- 
periments upon the diagram that these re- 
lations would always subsist, or at least 
would do so in a certain proportion of cases, 
and concludes their necessary, or probable, 
refers to paragraph 66 of Volume 1 of Peirce, 
1931-1958). 


Diagrams can be iconic, in other words, have 
the same structure as what they represent 
(Peirce, 4.447). It is the inspection of an 
iconic diagram that reveals truths other than 
those of the premises (2.279, 4.530). Hence, 
Peirce anticipates Maxwell, Wittgenstein, 
Kohler, and the model theory. Mental mod- 
els are as iconic as possible (Johnson-Laird, 
1983, pp. 125, 136). 

A resurgence of mental models in cog- 
nitive science began in the 1970s. Theorists 
proposed that knowledge was represented in 
mental models, but they were not wed to 
any particular structure for models. Hayes 
(1979) used the predicate calculus to de- 
scribe the naive physics of liquids. Other 
theorists in artificial intelligence proposed 
accounts of how to envision models and 
use them to simulate behavior (de Kleer, 
1977). Psychologists similarly examined 
naive and expert models of various domains, 
such as mechanics (McCloskey, Caramazza, 
& Green, 1980) and electricity (Gentner 
& Gentner, 1983). They argued that vi- 
sion yields a mental model of the three- 
dimensional structure of the world (Marr, 
1982). They proposed that individuals use 
these models to simulate behavior (e.g., 
Hegarty, 1992; Schwartz & Black, 1996). 
They also studied how models develop (e.g., 
Vosniadou & Brewer, 1992; Halford, 1993), 
how they serve as analogies (e.g., Holland, 
Holyoak, Nisbett, & Thagard, 1986; see 
Holyoak, Chap. 6), and how they help in 
the diagnosis of faults (e.g., Rouse & Hunt, 
1984). Artifacts, they argued, should be de- 
signed so users easily acquire models of them 
(e.g., Ehrlich, 1996; Moray, 1990, 1999). 

Discourse enables humans to experience 
the world by proxy, and so another early 
hypothesis was that comprehension yields 
models of the world (Johnson-Laird, 1970). 
The models are iconic in these ways: They 
contain a token for each referent in the 
discourse, properties corresponding to the 
properties of the referents, and relations cor- 
responding to the relations among the refer- 


tics (e.g., Bransford, Barclay, & Franks, 1972), 
linguistics (Karttunen, 1976), artificial intel- 
ligence (Webber, 1978), and formal seman- 
tics (Kamp, 1981). Experimental evidence 
corroborated the hypothesis, showing that 
individuals rapidly forget surface and un- 
derlying syntax (Johnson-Laird & Stevenson, 
1970), and even the meaning of individ- 
ual sentences (Garnham, 1987). They re- 
tain only models of who did what to whom. 
Psycholinguists discovered that models are 
constructed from the meanings of sentences, 
general knowledge, and knowledge of hu- 
man communication (e.g., Garnham, 2001; 
Garnham & Oakhill, 1996; Gernsbacher, 
1990; Glenberg, Meyer, & Lindem, 1987). 

Another early discovery was that con- 
tent affects deductive reasoning (Wason 
& Johnson-Laird, 1972; see Evans, Chap. 
8), which was hard to reconcile with 
the then dominant view that reason- 
ers depend on formal rules of inference 
(Braine, 1978; Johnson-Laird, 1975; Osher- 
son, 1974-1976). Granted that models come 
from perception and discourse, they could 
be used to reason (Johnson-Laird, 1975): 
An inference is valid if its conclusion holds 
in all the models of the premises because 
its conclusion must be true granted that its 
premises are true. The next section spells out 
this account. 
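
As a rough illustration of this definition of validity, the sketch below enumerates the possibilities compatible with the premises of the lost-book deduction (A or B or C; not B) and checks the conclusion in each. It treats models as complete truth assignments, which is a simplifying assumption and not a claim about how reasoners represent them. The names are illustrative.

```python
from itertools import product


def models(premises, names):
    """All possibilities (truth assignments) in which every premise holds."""
    result = []
    for values in product([True, False], repeat=len(names)):
        possibility = dict(zip(names, values))
        if all(premise(possibility) for premise in premises):
            result.append(possibility)
    return result


def is_valid(premises, conclusion, names):
    """An inference is valid if its conclusion holds in all models of the premises."""
    return all(conclusion(model) for model in models(premises, names))


names = ["A", "B", "C"]
premises = [
    lambda m: m["A"] or m["B"] or m["C"],  # A or B or C.
    lambda m: not m["B"],                  # Not B.
]

print(len(models(premises, names)))                            # 3
print(is_valid(premises, lambda m: m["A"] or m["C"], names))   # True
print(is_valid(premises, lambda m: m["A"], names))             # False
```

The conclusion "A or C" holds in all three models of the premises, whereas "A" fails in the model in which only C is true.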


Models and Deduction 


Mental models represent entities and per- 
sons, events and processes, and the opera- 
tions of complex systems. However, what 
is a mental model? The current theory is 
based on principles that distinguish mod- 
els from linguistic structures, semantic net- 
works, and other proposed mental represen- 
tations (Johnson-Laird & Byrne, 1991). The 
first principle is 


The principle of iconicity: A mental model 
has a structure that corresponds to the 
known structure of what it represents. 


Visual images are iconic, but mental mod- 
els underlie images. Even the rotation of 
tate three-dimensional models (Metzler & 
Shepard, 1982), and irrelevant images im- 
pair reasoning (Knauff, Fangmeier, Ruff, & 
Johnson-Laird, 2003; Knauff & Johnson- 
Laird, 2002). Moreover, many components 
of models cannot be visualized. 

One advantage of iconicity, as Peirce 
noted, is that models built from premises can 
yield new relations. For example, Schaeken, 
Johnson-Laird, and d’Ydewalle (1996) in- 
vestigated problems of temporal reasoning 
concerning such premises as 


John eats his breakfast before he listens to 
the radio. 


Given a problem based on several premises 
with the form: 


A before B. 
B before C. 
D while A. 
E while C. 


reasoners can build a mental model with 
the structure: 


A    B    C 
D         E 


where the left-to-right axis is time, and the 
vertical axis allows different events to be 
contemporaneous. Granted that each event 
takes roughly the same amount of time, 
reasoners can infer a new relation: 


D before E. 
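
A minimal sketch of how such a model might be built and then inspected is given below. It assumes that every event occupies one slot on a discrete time axis (the equal-duration assumption above), that premises use only the relations "before" and "while," and that the premises are determinate enough to yield a single model. The function names are hypothetical.

```python
def build_model(premises):
    """Place each event in a slot on a left-to-right time axis."""
    events = sorted({event for x, _, y in premises for event in (x, y)})
    position = {event: 0 for event in events}
    # For consistent premises the slots stabilize within this many passes.
    for _ in range(len(events) ** 2 + 1):
        changed = False
        for x, relation, y in premises:
            if relation == "before" and position[y] <= position[x]:
                position[y] = position[x] + 1  # push y to the right of x
                changed = True
            elif relation == "while" and position[x] != position[y]:
                position[x] = position[y] = max(position[x], position[y])
                changed = True
        if not changed:
            return position
    raise ValueError("the premises are inconsistent")


def before(model, x, y):
    """Read a new relation off the model: does x occur before y?"""
    return model[x] < model[y]


premises = [("A", "before", "B"), ("B", "before", "C"),
            ("D", "while", "A"), ("E", "while", "C")]
model = build_model(premises)
print(model)                    # {'A': 0, 'B': 1, 'C': 2, 'D': 0, 'E': 2}
print(before(model, "D", "E"))  # True
```

The relation between D and E is never stated in the premises. It emerges from the structure of the model, which is the advantage of iconicity that the text describes.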


Formal logic less readily yields the conclu- 
sion. One difficulty is that an infinite num- 
ber of conclusions follow validly from any 
set of premises, and logic does not tell you 
which conclusions are useful. From the pre- 
vious premises, for instance, this otiose con- 
clusion follows: 


A before B, and B before C. 


Possibilities are crucial, and the second 
principle of the theory assigns them a central 
role: 


The principle of possibilities: Each mental 
model represents a possibility. 


Disjunction 

A        B        A or else B, but not both 
True     True     False 
True     False    True 
False    True     True 
False    False    False 


This principle is illustrated in sentential 
reasoning, which hinges on negation and 
such sentential connectives as “if” and “or.” 
In logic, these connectives have idealized 
meanings: They are truth-functional in that 
the truth-values of sentences formed with 
them depend solely on the truth-values of 
the clauses that they connect. For example, 
a disjunction of the form: A or else B but not 
both is true if A is true and B is false, and if 
A is false and B is true, but false in any other 
case. Logicians capture these conditions in a 
truth table, as shown in Table 9.1. Each row 
in the table represents a different possibility 
(e.g., the first row represents the possibility 
in which both A and B are true), and so here 
the disjunction is false. 
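
The truth table for the exclusive disjunction can be generated directly from its truth-functional definition, as in this short sketch (the function name is illustrative):

```python
from itertools import product


def exclusive_or(a, b):
    """A or else B but not both: true when exactly one clause is true."""
    return a != b


# One row per possibility, in the same order as the truth table in the text.
rows = [(a, b, exclusive_or(a, b)) for a, b in product([True, False], repeat=2)]
for a, b, value in rows:
    print(a, b, value)

# The rows in which the disjunction is true.
true_rows = [(a, b) for a, b, value in rows if value]
print(true_rows)  # [(True, False), (False, True)]
```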

Naive reasoners do not use truth tables 
(Osherson, 1974-1976). Fully explicit mod- 
els of possibilities, however, are a step to- 
ward psychological plausibility. The fully ex- 
plicit models of the exclusive disjunction, 
A or else B but not both, are shown here on 
separate lines: 


A ¬B 
¬A B 

where “¬” denotes negation. Table 9.2 


presents the fully explicit models for the 
main sentential connectives. Fully explicit 
models correspond exactly to the true rows 
in the truth table for each connective. As 
the table shows, the conditional If A then B 
is treated in logic as though it can be para- 
phrased as If A then B, and if not-A then B or 
not-B. The paraphrase does not do justice to 
the varied meanings of everyday condition- 
als (Johnson-Laird & Byrne, 2002). In fact, 
no connectives in natural language are truth 
Table 9.2. Fully Explicit Models and Mental Models of Possibilities 
Compatible with Sentences Containing the Principal Sentential Connectives 


Sentences                     Fully Explicit Models     Mental Models 
A and B:                      A    B                    A    B 
Neither A nor B:              ¬A   ¬B                   ¬A   ¬B 
A or else B but not both:     A    ¬B                   A 
                              ¬A   B                         B 
A or B or both:               A    ¬B                   A 
                              ¬A   B                         B 
                              A    B                    A    B 
If A then B:                  A    B                    A    B 
                              ¬A   B 
                              ¬A   ¬B 
If, and only if A, then B:    A    B                    A    B 
                              ¬A   ¬B 


functional (see the section on implicit induc- 
tion and the modulation of models). 

Fully explicit models yield a more efficient
reasoning procedure than truth tables.
Each premise has a set of fully explicit
models; for example, the premises:


1. A or else B but not both. 
2. Not-A. 


have the models: 


(Premise 1)
 A   ¬B
¬A    B


(Premise 2)
¬A


Their conjunction depends on combining 
each model in one set with each model in 
the other set according to two main rules: 


• A contradiction between a pair of models
  yields the null model (akin to the empty
  set).

• Any other conjunction yields a model of
  each proposition in the two models.


The result is: 


Input from (1)    Input from (2)    Output
 A   ¬B           ¬A                null model
¬A    B           ¬A                ¬A    B


or in brief:


¬A    B


Because an inference is valid if its conclu- 
sion holds in all the models of the premises, 
it follows that: B. The same rules are 
used recursively to construct the models 
of compound premises containing multiple 
connectives. 
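The two rules for conjoining fully explicit models can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program cited in the text: a model is represented here as a dictionary from proposition names to truth values, the null model as `None`, and the helper names `conjoin`, `conjoin_sets`, and `follows` are hypothetical.

```python
from itertools import product

def conjoin(m1: dict, m2: dict):
    """Combine two fully explicit models; return None for the null model."""
    for atom, value in m2.items():
        if atom in m1 and m1[atom] != value:
            return None          # a contradiction yields the null model
    return {**m1, **m2}          # otherwise keep every proposition from both

def conjoin_sets(set1, set2):
    """Pairwise conjunction of two sets of models, dropping null models."""
    results = []
    for m1, m2 in product(set1, set2):
        combined = conjoin(m1, m2)
        if combined is not None and combined not in results:
            results.append(combined)
    return results

def follows(models, atom, value=True):
    """A conclusion is valid if it holds in all the models of the premises."""
    return all(m.get(atom) == value for m in models)

premise_1 = [{"A": True, "B": False}, {"A": False, "B": True}]  # A or else B
premise_2 = [{"A": False}]                                       # Not-A
models = conjoin_sets(premise_1, premise_2)

print(models)
print(follows(models, "B"))
```

Only the model in which A is false and B is true survives, so the conclusion B holds in every remaining model.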

Because infinitely many conclusions fol- 
low from any premises, computer programs 
for proving validity generally evaluate con- 
clusions given to them by the user. Hu- 
man reasoners, however, can draw conclu- 
sions for themselves. They normally abide 
by two constraints (Johnson-Laird & Byrne, 
1991). First, they do not throw semantic in- 
formation away by adding disjunctive alter- 
natives. For instance, given a single premise, 
A, they never spontaneously conclude, A or 
B or both. Second, they draw novel conclu- 
sions that are parsimonious. For instance, 
they never draw a conclusion that merely 
conjoins the premises, even though such 
a deduction is valid. Of course, human 
performance rapidly degrades with com- 
plex problems, but the goal of parsimony 
suggests that intelligent programs should 
draw conclusions that succinctly express 
all the information in the premises. The 
model theory yields an algorithm that draws
such conclusions (Johnson-Laird & Byrne, 1991, Chap. 9).

Fully explicit models are simpler than 
truth tables but place a heavy load on work- 
ing memory. Mental models are still simpler 
because they are limited by the third princi- 
ple of the theory: 


The principle of truth: A mental model rep- 
resents a true possibility, and it represents a 
clause in the premises only when the clause 
is true in the possibility. 


The simplest illustration of the principle is 
to ask naive individuals to list what is possi- 
ble for a variety of assertions (Barrouillet & 
Lecas, 1999; Johnson-Laird & Savary, 1996). 
Given an exclusive disjunction, not-A or else 
B, they list two possibilities corresponding 
to the mental models: 


¬A
      B


The first mental model does not represent
B, which is false in this possibility, and the
second mental model does not represent not-A,
which is false in this possibility (in other
words, A is true). Hence, people tend to
neglect these cases. Readers might assume that
the principle of truth is equivalent to the 
representation of the propositions mentioned 
in the premises. However, this assumption 
yields the same models of A and B regardless 
of the connective relating them. The right 
way to conceive the principle is that it yields 
pared-down versions of fully explicit mod- 
els, which in turn map into truth tables. As 
we will see, the principle of truth predicts a 
striking effect on reasoning. 
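The idea that mental models are pared-down versions of fully explicit models can be illustrated with a short sketch. It assumes a premise built from simple clauses (here, not-A and B), represents each clause as a hypothetical (proposition, truth value) pair, and keeps a clause in a model only when the clause is true in that possibility. It does not cover conditionals, whose mental models also include an implicit model.

```python
def mental_models(clauses, explicit_models):
    """Pare down fully explicit models, keeping only the clauses that are true."""
    pared = []
    for model in explicit_models:
        kept = tuple((atom, value) for atom, value in clauses
                     if model[atom] == value)
        if kept not in pared:
            pared.append(kept)
    return pared

# Exclusive disjunction "not-A or else B": the clauses are not-A and B.
clauses = [("A", False), ("B", True)]
explicit = [
    {"A": False, "B": False},   # not-A is true, B is false
    {"A": True, "B": True},     # not-A is false, B is true
]

result = mental_models(clauses, explicit)
print(result)
```

The first resulting model contains only not-A and the second only B, matching the two possibilities that individuals list.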

Individuals can make a mental footnote 
about what is false in a possibility, and these 
footnotes can be used to flesh out mental 
models into fully explicit models. However, 
footnotes tend to be ephemeral. The most 
recent computer program implementing the 
model theory operates at two levels of 
expertise. At its lowest level, it makes no use 
of footnotes. Its representation of the main 
sentential connectives is summarized in Ta- 
ble 9.2. The mental models of a conditional, 
if A then B, are


A    B
. . .


where the ellipsis denotes an implicit model of the
possibilities in which the antecedent of the
conditional is false. In other words, there are 
alternatives to the possibility in which A and 
B are true, but individuals tend not to think 
explicitly about what holds in these possibil- 
ities. If they retain the footnote about what 
is false, then they can flesh out these mental 
models into fully explicit models. The mental
models of the biconditional, If, and only if A,
then B, as Table 9.2 shows, are identical
to those for the conditional. What differs is 
that the footnote now conveys that both A 
and B are false in the implicit model. The 
program at its higher level uses fully explicit 
models and so makes no errors in reasoning. 

Inferences can be made with mental mod- 
els using a procedure that builds a set of 
models for a premise and then updates them 
according to the other premises. From the 
premises, 


A or else B but not both. 
Not-A. 


the disjunction yields the mental models 


A 
B 


The categorical premise eliminates the first 
model, but it is compatible with the second 
model, yielding the valid conclusion, B. The 
rules for updating mental models are sum- 
marized in Table 9.3. 

Table 9.3. The Procedures for Forming a Conjunction of a Pair of Models.
Each procedure is presented with an accompanying example. Only mental
models may be implicit and therefore call for the first two procedures.

1: The conjunction of a pair of implicit models yields the implicit model:

      . . . and . . . yield . . .

2: The conjunction of an implicit model with a model representing
   propositions yields the null model (akin to the empty set) by default,
   for example,

      . . . and B C yield nil.

   But, if none of the atomic propositions (B C) is represented in the set
   of models containing the implicit model, then the conjunction yields
   the model of the propositions, for example,

      . . . and B C yield B C.

3: The conjunction of a pair of models representing respectively a
   proposition and its negation yields the null model, for example,

      A ¬B and ¬A yield nil.

4: The conjunction of a pair of models in which a proposition, B, in one
   model is not represented in the other model depends on the set of
   models of which this other model is a member. If B occurs in at least
   one of these models, then its absence in the current model is treated
   as negation, for example,

      A B and A yield nil.

   However, if B does not occur in any of these models (e.g., only its
   negation occurs in them), then its absence is treated as equivalent to
   its affirmation, and the conjunction (following the next procedure) is

      A B and A yield A B.

5: The conjunction of a pair of fully explicit models free from
   contradiction updates the second model with all the new propositions
   from the first model, for example,

      ¬A B and ¬A C yield ¬A B C.


The model theory of deduction began
with an account of reasoning with quantifiers,
as in syllogisms such as:


Some actuaries are businessmen.
All businessmen are conformists.

Therefore, some actuaries are conformists.


A plausible hypothesis is that people
construct models of the possibilities compatible
with the premises and draw whatever
conclusion, if any, holds in all of them.
Johnson-Laird (1975) illustrated such an
account with Euler circles. A premise of
the form, Some A are B, however, is
compatible with four distinct possibilities, and
the previous premises are compatible with
16 distinct possibilities. Because the inference
is easy, reasoners may fail to consider
all the possibilities (Erickson, 1974), or
they may construct models that capture 
more than one possibility (Johnson-Laird & 
Bara, 1984). The program implementing the 
model theory accordingly constructs just one 
model for the previous premises: 


actuary    [businessman]    conformist
actuary
           [businessman]    conformist
. . .


where each row represents a different sort of 
individual, the ellipsis represents the possi- 
bility of other sorts of individual, and the 
square brackets represent that the set of 
businessmen has been represented exhaus- 
tively — in other words, no more tokens 
representing businessmen can be added to 
the model. This model yields the conclusion 
that Some actuaries are conformists. There are 
many ways in which reasoners might use 
such models, and Johnson-Laird and Bara
(1984) described two alternative strategies.
Years of tinkering with the models for syl- 
logisms suggest that reasoning does not rely 
on a single deterministic procedure. The fol- 
lowing principle applies to thinking in gen- 
eral but can be illustrated for reasoning: 


The principle of strategic variation: Given 
a class of problems, reasoners develop a va- 
riety of strategies from exploring manipu- 
lations of models (Bucciarelli & Johnson- 
Laird, 1999). 


Stenning and his colleagues anticipated this 
principle in an alternative theory of syl- 
logistic reasoning (e.g., Stenning & Yule, 
1997). They proposed that reasoners focus 
on individuals who necessarily exist given 
the premises (e.g., given the premise Some 
A are B, there must be an A who is B). 
They implemented this idea in three different
algorithms that all yield the same inferences.
One algorithm is based on Euler circles
supplemented with a notation for necessary
individuals, one is based on tokens
of individuals in line with the model theory,
and one is based on verbal rules, such as 


If there are two existential premises, that 
is, that contain “some”, then respond that 
there is no valid conclusion. 


Stenning and Yule concluded from the 
equivalence of the outputs from these al- 
gorithms that a need exists for data be- 
yond merely the conclusions that reason- 
ers draw, and they suggested that reasoners 
may develop different representational sys- 
tems, depending on the task. Indeed, from 
Störring (1908) to Stenning (2002), psychologists
have argued that some reasoners
may use Euler circles and others may use 
verbal procedures. 

The external models that reasoners con- 
structed with cut-out shapes corroborated 
the principle of strategic variation: Individ- 
uals develop various strategies (Bucciarelli 
& Johnson-Laird, 1999). They also overlook 
possible models of premises. Their search 
may be organized toward finding necessary 
individuals, as Stenning and Yule showed, 
but the typical representations of premises 
included individuals who were not neces- 
sary; for example, the typical representation 
of Some A are B was 


A B 
A B 
A 


A focus on necessary individuals is a partic- 
ular strategy. Other strategies may call for 
the representation of other sorts of individ- 
uals, especially if the task changes — a view 
consistent with Stenning and Yule’s theory. 
For example, individuals readily make the 
following sort of inference (Evans, Handley, 
Harper, & Johnson-Laird, 1999): 


Some A are B. 
Some B are C. 


Therefore, it is possible that Some A 
are C. 


Such inferences depend on the representa- 
tion of possible individuals. 

The model theory has been extended
to some sorts of inference based on premises
containing more than one quantifier
(Johnson-Laird, Byrne, & Tabossi, 1989).
Many such inferences are beyond the scope 
of Euler circles, although the general prin- 
ciples of the model theory still apply to 
them. Consider, for example, the inference 
(Cherubini & Johnson-Laird, 2004): 


There are four persons: Ann, Bill, Cath, 
and Dave. 

Everybody loves anyone who loves some- 
one. 


Ann loves Bill. 
What follows? 


Most people can envisage this model in 
which arrows denote the relation of loving: 


Cath        Dave


Ann   →   Bill


Hence, they infer that everyone loves Ann. 
However, if you ask them whether it follows 
that Cath loves Dave, they tend to respond
“no.” They are mistaken, but the inference
calls for using the quantified premise again. 
The result is this model (strictly speaking, all 
four persons love themselves, too): 


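[Diagram: arrows denoting the relation of loving run from each of the four persons to each of the others.]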


It follows that Cath loves Dave, and people 
grasp its validity if it is demonstrated with 
diagrams. No complete model theory exists 
for inferences based on quantifiers and con- 
nectives (cf. Bara, Bucciarelli, & Lombardo, 
2001). However, the main principles of the 
theory should apply: iconicity, possibilities, 
truth, and strategic variation. 
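The need to use the quantified premise a second time in the Ann and Bill problem can be made concrete with a small sketch. It assumes that the relation of loving is stored as a set of (lover, loved) pairs; the function name `apply_premise` is a hypothetical helper.

```python
people = ["Ann", "Bill", "Cath", "Dave"]
loves = {("Ann", "Bill")}

def apply_premise(loves, people):
    """Everybody loves anyone who loves someone: one pass over the relation."""
    lovers = {lover for lover, _ in loves}
    return loves | {(person, lover) for person in people for lover in lovers}

# One use of the premise: only Ann is known to love someone,
# so everyone loves Ann.
first_pass = apply_premise(loves, people)

# Reapply the premise until nothing new is added.
closure = first_pass
while True:
    updated = apply_premise(closure, people)
    if updated == closure:
        break
    closure = updated

print(("Cath", "Dave") in first_pass)
print(("Cath", "Dave") in closure)
```

After one pass, Cath loves Ann but not yet Dave. Because everyone now loves someone, the second pass adds every remaining pair, including the pairs in which a person loves himself or herself.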


Experimental Studies of 
Deductive Reasoning 


Many experiments have corroborated the
model theory (for a bibliography, see the
Web page created by Ruth Byrne:
www.tcd.ie/Psychology/People/Ruth_Byrne/mental_...).
This section outlines corroborations of five predictions.

Prediction 1: The fewer the models 
needed for an inference, and the simpler they 
are, the less time the inference should take 
and the less prone it should be to error. Fewer 
entities do improve inferences (e.g., Birney & 
Halford, 2002). Likewise, fewer models 
improve spatial and temporal reasoning 
(Byrne & Johnson-Laird, 1989; Carreiras & 
Santamaria, 1997; Schaeken, Johnson-Laird, 
& d’Ydewalle, 1996; Vandierendonck & De 
Vooght, 1997). Premises yielding one model 
take less time to read than corresponding 
premises yielding multiple models; however,
the difference between two and three
models is often so small that it is unlikely
that reasoners construct all three models
(Vandierendonck, De Vooght, Desimpelaere,
& Dierckx, 2000). They may build a
single model with one element represented 
as having two or more possible locations. 

Effects of number of models have been 
observed in comparing one sort of sentential 
connective with another and in examining 
batteries of such inferences (see Johnson- 
Laird & Byrne, 1991). To illustrate these 
effects, consider the “double disjunction” 
(Bauer & Johnson-Laird, 1993): 


Ann is in Alaska or else Beth is in Barba- 
dos, but not both. 

Beth is in Barbados or else Cath is in 
Canada, but not both. 

What follows? 


Reasoners readily envisage the two possibil- 
ities compatible with the first premise, but 
it is harder to update them with those from 
the second premise. The solution is 


Ann in Alaska Cath in Canada 
Beth in Barbados 


People represent the spatial relations: Mod- 
els are not made of words. The two models 
yield the conclusion: Either Ann is in Alaska 
and Cath is in Canada or else Beth is in Barbados.
An increase in complexity soon defeats most people:


Ann is in Alaska or Beth is in Barbados, 
or both. 

Beth is in Barbados or Cath is in Canada, 
or both. 

What follows? 


The premises yield five models, from which 
it follows: Ann is in Alaska and Cath is in 
Canada, or Beth is in Barbados, or all three. 
When the order of the premises reduces the 
number of models to be held in mind, rea- 
soning improves (Garcia-Madruga, Moreno, 
Carriedo, Gutiérrez, & Johnson-Laird, 2001; 
Girotto, Mazzocco, & Tasso, 1997; Mac- 
kiewicz & Johnson-Laird, 2003). 
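The number of fully explicit models for the two double disjunctions can be checked by brute-force enumeration. The sketch below assumes that each premise is treated truth-functionally and that the three propositions are independent; the names `models`, `exclusive`, and `inclusive` are illustrative.

```python
from itertools import product

def models(premises, atoms):
    """Return every assignment of truth values that satisfies all premises."""
    result = []
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(premise(world) for premise in premises):
            result.append(world)
    return result

atoms = ["ann", "beth", "cath"]   # Ann in Alaska, Beth in Barbados, Cath in Canada

exclusive = [lambda w: w["ann"] != w["beth"],    # ... or else ..., but not both
             lambda w: w["beth"] != w["cath"]]
inclusive = [lambda w: w["ann"] or w["beth"],    # ... or ..., or both
             lambda w: w["beth"] or w["cath"]]

exclusive_models = models(exclusive, atoms)
inclusive_models = models(inclusive, atoms)
print(len(exclusive_models), len(inclusive_models))
```

The exclusive premises leave two possibilities, whereas the inclusive premises leave five, which is consistent with the greater difficulty of the second problem.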

Because one model is easier than many, 
an interaction occurs in modal reasoning. It is 
easier to infer that a situation is possible (one 
model of the premises suffices as an exam- 
ple) than that it is not possible (all the mod- 
els of the premises must be checked for a 
counterexample to the conclusion). In con- 
trast, it is easier to infer that a situation is 
not necessary (one counterexample suffices) 
than that it is necessary (all the models of 
the premises must be checked as examples). 
The interaction occurs in both accuracy and 
speed (Bell & Johnson-Laird, 1998; see also 
Evans et al., 1999). 
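The asymmetry between possibility and necessity can be illustrated by counting how many models must be inspected before a question is settled. This is a rough sketch under simplifying assumptions: the models are the five fully explicit models of the inclusive double disjunction, they are inspected in a fixed order, and the function `checks_needed` is hypothetical.

```python
# Fully explicit models as (Ann in Alaska, Beth in Barbados, Cath in Canada).
MODELS = [
    (True, True, True),
    (True, True, False),
    (True, False, True),
    (False, True, True),
    (False, True, False),
]

def checks_needed(models, conclusion, question):
    """Count the models inspected before the question is settled."""
    for count, model in enumerate(models, start=1):
        holds = conclusion(model)
        if question == "possible" and holds:
            return count, True       # one example suffices
        if question == "necessary" and not holds:
            return count, False      # one counterexample suffices
    # Every model was inspected without an early exit.
    return len(models), question == "necessary"

beth_there = lambda m: m[1]
ann_or_beth = lambda m: m[0] or m[1]
neither = lambda m: not m[0] and not m[1]

print(checks_needed(MODELS, beth_there, "possible"))
print(checks_needed(MODELS, neither, "possible"))
print(checks_needed(MODELS, beth_there, "necessary"))
print(checks_needed(MODELS, ann_or_beth, "necessary"))
```

A "yes" to possibility and a "no" to necessity can be reached early, whereas a "no" to possibility and a "yes" to necessity require all five models.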

Prediction 2: Reasoners should err as a re- 
sult of overlooking models of the premises. 
Given a double disjunction (such as the pre- 
vious one), the most frequent errors were 
conclusions consistent with just a single 
model of the premises (Bauer & Johnson- 
Laird, 1993). Likewise, given a syllogism of 
the form,


None of the A is a B.
All the B are C.


reasoners infer: None of the A is a C (Newstead
& Griggs, 1999). They overlook the
possibility in which Cs that are not Bs are
As, and so the valid conclusion is


Some of the C are not A. 


They may have misinterpreted the second
premise, taking it also to mean that all
the C are B, but many errors with syllogisms appear
to arise because individuals consider only 
a single model (Bucciarelli & Johnson- 
Laird, 1999; Espino, Santamaria, & Garcia- 
Madruga, 2000). Ormerod proposed a “min- 
imal completion” hypothesis according to 
which reasoners construct only the min- 
imally necessary models (see Ormerod, 
Manktelow, & Jones, 1993; Richardson & 
Ormerod, 1997). Likewise, Sloutsky pos- 
tulated a process of “minimalization” in 
which reasoners tend to construct only sin- 
gle models for all connectives, thereby re- 
ducing them to conjunctions (Morris & 
Sloutsky, 2002; Sloutsky & Goldvarg, 1999). 
Certain assertions, however, do tend to 
elicit more than one model. As Byrne
and her colleagues showed (e.g., Byrne,
2002; Byrne & McEleney, 2000; Byrne &
Tasso, 1999), counterfactual conditionals
such as


If the cable hadn't been faulty then the
printer wouldn't have broken


tend to elicit models of both what is factu- 
ally the case, that is, 


cable faulty printer broken 


and what holds in a counterfactual possibil- 
ity 


¬ cable faulty    ¬ printer broken


Prediction 3: Reasoners should be able to
refute invalid inferences by envisaging coun- 
terexamples (i.e., models of the premises 
that refute the putative conclusion). There 
is no guarantee that reasoners will find a 
counterexample, but, where they do suc- 
ceed, they know that an inference is in- 
valid (Barwise, 1993). The availability of a 
counterexample can suppress fallacious in- 
ferences from a conditional premise (Byrne, 
Espino, & Santamaria, 1999; Markovits, 
1984; Vadeboncoeur & Markovits, 1999). 
Nevertheless, an alternative theory based
on mental models has downplayed the
role of counterexamples (Polk & Newell,
1995), and experiments have sometimes
failed to show their use (e.g., Newstead,
Handley, & Buck, 1999). However,
when reasoners had to construct exter- 
nal models (Bucciarelli & Johnson-Laird, 
1999), they used counterexamples (see 
also Neth & Johnson-Laird, 1999; Roberts, 
in press). 

There are two sorts of invalid conclusions. 
One sort is invalid because the conclusion is 
disjoint with the premises; for example, 


A or B or both. 
B or else C but not both. 
Therefore, not-A and C. 


The premises have three fully explicit 
models: 


 A   ¬B    C
¬A    B   ¬C
 A    B   ¬C


The conclusion is inconsistent with the 
premises because it conflicts with each of 
their models. But, another sort of invalid 
conclusion is consistent with the premises 
but does not follow from them such as the 
conclusion A and not-C from the previous 
premises. It is consistent with the premises 
because it corresponds to their third model, 
but it does not follow from them because 
the other two models are counterexamples. 
Reasoners usually establish the invalidity of 
the first sort of conclusion by detecting its 
inconsistency with the premises, but they 
refute the second sort of conclusion with a 
counterexample (Johnson-Laird & Hasson, 
2003). An experiment using functional mag- 
netic resonance imaging showed that reason- 
ing based on numeric quantifiers, such as at 
least five — as opposed to arithmetical cal- 
culation based on the same premises — de- 
pended on the right frontal hemisphere. A 
search for counterexamples appeared to ac- 
tivate the right frontal pole (Kroger, Cohen, 
& Johnson-Laird, 2003). 
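The distinction between the two sorts of invalid conclusion can be expressed as a simple classification over the models of the premises. The sketch below assumes truth-functional readings of the premises "A or B or both" and "B or else C but not both"; the function `classify` and its labels are illustrative.

```python
from itertools import product

atoms = ["A", "B", "C"]
premises = [lambda w: w["A"] or w["B"],     # A or B or both
            lambda w: w["B"] != w["C"]]     # B or else C but not both

worlds = (dict(zip(atoms, values)) for values in product([True, False], repeat=3))
models = [w for w in worlds if all(p(w) for p in premises)]

def classify(conclusion):
    """Compare a conclusion with every model of the premises."""
    holds = [conclusion(w) for w in models]
    if all(holds):
        return "valid"
    if not any(holds):
        return "inconsistent with the premises"
    return "consistent but invalid"

not_a_and_c = classify(lambda w: not w["A"] and w["C"])
a_and_not_c = classify(lambda w: w["A"] and not w["C"])
print(not_a_and_c)
print(a_and_not_c)
```

The conclusion not-A and C holds in none of the three models, whereas A and not-C holds in one model and is refuted by the other two, which serve as counterexamples.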

Prediction 4: Reasoners should succumb 
to illusory inferences, which are compelling 
but invalid. They arise from the principle of
truth and the resulting neglect of what is false.
Consider the problem:


Only one of the following assertions is true 
about a particular hand of cards: 


There is a king in the hand or there is 
an ace, or both. 

There is a queen in the hand or there is 
an ace, or both. 

There is a jack in the hand or there is a 
ten, or both. 


Is it possible that there is an ace in the hand?


Nearly everyone responds, “yes” (Goldvarg 
& Johnson-Laird, 2000). They grasp that 
the first assertion allows two possibilities in 
which an ace occurs, so they infer that an ace 
is possible. However, it is impossible for an 
ace to be in the hand because both of the first 
two assertions would then be true, contrary 
to the rubric that only one of them is true. 
The inference is an illusion of possibility: 
Reasoners infer wrongly that a card is pos- 
sible. A similar problem to which reason- 
ers tend to respond “no” and thereby com- 
mit an illusion of impossibility is created by 
replacing the two occurrences of “there is 
an ace” in the problem with, “there is not 
an ace.” When the previous premises were 
stated with the question 


Is it possible that there is a jack? 


the participants nearly all responded “yes,” 
again. They considered the third assertion, 
and its mental models showed that there 
could be a jack. However, this time they 
were correct: The inference is valid. Hence, 
the focus on truth does not always lead to error,
and experiments have accordingly compared
illusions with matching control problems
for which the neglect of falsity should
not affect accuracy. 
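The correct answers to the ace and jack questions can be verified by enumerating every hand that respects the rubric. The sketch below assumes that each card is simply present or absent and that "only one of the following assertions is true" means exactly one; the variable names are illustrative.

```python
from itertools import product

cards = ["king", "queen", "jack", "ten", "ace"]
assertions = [
    lambda h: h["king"] or h["ace"],    # a king or an ace, or both
    lambda h: h["queen"] or h["ace"],   # a queen or an ace, or both
    lambda h: h["jack"] or h["ten"],    # a jack or a ten, or both
]

all_hands = (dict(zip(cards, values))
             for values in product([True, False], repeat=len(cards)))

# Keep only the hands for which exactly one assertion is true.
hands = [h for h in all_hands
         if sum(bool(a(h)) for a in assertions) == 1]

ace_possible = any(h["ace"] for h in hands)
jack_possible = any(h["jack"] for h in hands)
print(ace_possible, jack_possible)
```

No admissible hand contains an ace, because an ace would make the first two assertions true together, but some admissible hands contain a jack.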

The computer program implementing 
the theory shows that illusory inferences 
should be sparse in the set of all possi- 
ble inferences. However, experiments have 
corroborated their occurrence in reasoning
about possibilities, probabilities, and causal
relations. Table 9.4 illustrates
some different illusions. Studies have used
remedial procedures to reduce the illusions
(e.g., Santamaria & Johnson-Laird, 2000).
Yang taught participants to think explic- 
itly about what is true and what is false. 
The difference between illusions and con- 
trol problems vanished, but performance 
on the control problems fell from almost 
100% correct to around 75% correct (Yang 
& Johnson-Laird, 2000). The principle of 
truth limits understanding, but it does so 
without participants realizing it. They were 
highly confident in their responses, no less 
so when they succumbed to an illusion 
than when they responded correctly to a 
control problem. 

The rubric, “one of these assertions is 
true and one of them is false,” is equiva- 
lent to an exclusive disjunction between two 
assertions: A or else B, but not both. This us- 
age leads to compelling illusions that seduce 
novices and experts alike, for example, 


If there is a king then there is an ace, or 
else if there isn’t a king then there is an 
ace. 


There is a king. 
What follows? 


More than 2000 individuals have tackled this 
problem (see Johnson-Laird & Savary, 1999), 
and nearly everyone responded, “there is an 
ace.” The prediction of an illusion depends 
not on logic but on how other participants 
interpreted the relevant connectives in sim- 
ple assertions. The preceding illusion occurs 
with the rubric: One of these assertions is 
true and one of them is false applying to the 
conditionals. That the conclusion is illusory 
rests on the following assumption, corroborated
experimentally: If a conditional is false,
then one possibility is that its antecedent 
is true and its consequent is false. If skep- 
tics think that the illusory responses are 
correct, then how do they explain the ef- 
fects of a remedial procedure? They should 
then say that the remedy produced illusions. 
Readers may suspect that the illusions arise
from the artificiality of the problems, which
never occur in real life and therefore
confuse the participants. The problems may
be artificial, although analogs do occur in
real life (see Johnson-Laird & Savary, 1999),
and artificiality fails to explain the correct
responses to the controls or the high
ratings of confidence in both illusory and
control conclusions.


Table 9.4. Some illusory inferences in abbreviated form, with percentages of
illusory responses. Each study examined other sorts of illusions and matched
control problems.

Premises                                                    Illusory responses           Percentages of illusory responses
1. If A then B or else B. A.                                B.                           100
2. Either A and B, or else C and D. A.                      B.                           87
3. If A then B or else if C then B. A and B.                Possibly both are true.      98
4. A or else not both B and C. A and not B.                 Possibly both are true.      91
5. One true and one false: not-A or not-B, or neither.
   Not-C and not-B.                                         Possibly not-C and not-B.    85
6. Only one is true: At least some A are not B.
   No A are B.                                              Possibly No B are A.         95
7. If one is true so is the other: A or else not B. A.      A is more likely than B.     95
8. If one is true so is the other: A if and only if B. A.   A is equally likely as B.    90

Note: 1 is from Johnson-Laird and Savary (1999), 2 is from Walsh and Johnson-Laird (2003), 3 is from
Johnson-Laird, Legrenzi, Girotto, and Legrenzi (2000), 4 is from Legrenzi, Girotto, and Johnson-Laird (2003),
5 is from Goldvarg and Johnson-Laird (2000), 6 is from Experiment 2, Yang and Johnson-Laird (2000), and
7 and 8 are from Johnson-Laird and Savary (1996).
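The illusory status of the conclusion "there is an ace" in the king and ace problem can be checked with a short enumeration. The sketch rests on a simplifying assumption that the text itself qualifies: both conditionals are treated as material conditionals, so that a false conditional has a true antecedent and a false consequent. The helper `implies` is hypothetical.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material conditional (an assumption of this sketch)."""
    return (not p) or q

worlds = []
for king, ace in product([True, False], repeat=2):
    first = implies(king, ace)        # if there is a king then there is an ace
    second = implies(not king, ace)   # if there isn't a king then there is an ace
    # One conditional is true and the other is false, and there is a king.
    if first != second and king:
        worlds.append({"king": king, "ace": ace})

print(worlds)
```

Under these assumptions the only remaining possibility contains a king and no ace, so the compelling conclusion that there is an ace does not follow.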


Prediction 5: Naive individuals should develop
different reasoning strategies based on
models. When they are tested in the labo- 
ratory, they start with only rough ideas of 
how to proceed. They can reason, but not 
efficiently. With experience but no feedback 
about accuracy, they spontaneously develop 
various strategies (Schaeken, De Vooght, 
Vandierendonck, & d’Ydewalle, 1999). De- 
duction itself may be a strategy (Evans, 
2000), and people may resort to it more 
in Western cultures than in East Asian cul- 
tures (Peng & Nisbett, 1999). However, 
deduction itself leads to different strate- 
gies (Van der Henst, Yang, & Johnson- 
Laird, 2002). Consider a problem in which 
each premise is compound, that is, contains 
a connective: 


A if and only if B. 

Either B or else C, but not both. 

C if and only if D. 

Does it follow that if not A then D? 


where A, B, ... refer to different colored
marbles in a box. Some individuals develop
a strategy based on suppositions. They say, 
for example, 


Suppose not A. It follows from the first 
premise that not B. It follows from the sec- 
ond premise that C. The third premise then 
implies D. So, yes, the conclusion follows. 


Some individuals construct a chain of con- 
ditionals leading from one clause in the con- 
clusion to the other — for example: If D then 
C, If C then not B, If not B then not A. Oth- 
ers develop a strategy in which they enu- 
merate the different possibilities compatible 
with the premises. For example, they draw 
a horizontal line across the page and write 
down the possibilities for the premises: 


A B 


C D 
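The strategy of enumerating possibilities for the marble problem can be sketched as follows. The code assumes truth-functional readings of the three premises and checks the conclusion "if not A then D" in every possibility in which A is absent; the variable names are illustrative.

```python
from itertools import product

atoms = ["A", "B", "C", "D"]
premises = [
    lambda w: w["A"] == w["B"],   # A if and only if B
    lambda w: w["B"] != w["C"],   # either B or else C, but not both
    lambda w: w["C"] == w["D"],   # C if and only if D
]

worlds = (dict(zip(atoms, values)) for values in product([True, False], repeat=4))
possibilities = [w for w in worlds if all(p(w) for p in premises)]

# "If not A then D" follows if D holds wherever A does not.
conclusion_follows = all(w["D"] for w in possibilities if not w["A"])

for w in possibilities:
    print([name for name in atoms if w[name]])
print(conclusion_follows)
```

Two possibilities remain, one containing A and B and the other containing C and D, and the conclusion holds.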


When individuals are taught to use this 
strategy, as Victoria Bell showed in un- 
published studies, their reasoning is faster 
and more accurate. The nature of the 
premises and the conclusion can bias rea- 
soners to adopt a predictable strategy (e.g., 
conditional premises encourage the use of
suppositions, whereas disjunctive premises
encourage the enumeration of possibilities;
Van der Henst et al., 2002).

Reasoners develop diverse strategies for 
relational reasoning (e.g., Goodwin &
Johnson-Laird, in press; Roberts, 2000), sup- 
positional reasoning (e.g., Byrne & Hand- 
ley, 1997), and reasoning with quantifiers 
(e.g., Bucciarelli & Johnson-Laird, 1999). 
Granted the variety of strategies, there re- 
mains a robust effect: Inferences from one 
mental model are easier than those from 
more than one model (see also Espino, 
Santamaria, Meseguer, & Carreiras, 2000). 
Different strategies could reflect different 
mental representations (Stenning & Yule, 
1997), but those so far discovered are all 
compatible with models. Individuals who 
have mastered logic could make a strategic 
use of formal rules. Given sufficient expe- 
rience with a class of problems, individuals 
begin to notice some formal patterns. 


Probabilistic Reasoning 


Reasoning about probabilities is of two 
sorts. In intensional reasoning, individuals 
use heuristics to infer the probability of 
an event from some sort of index, such as 
the availability of information. In extensional 
reasoning, they infer the probability of an 
event from a knowledge of the different ways 
in which it might occur. This distinction 
is due to Nobel laureate Daniel Kahneman 
and the late Amos Tversky, who together 
pioneered the investigation of heuristics 
(Kahneman, Slovic, & Tversky, 1982; see 
Kahneman & Frederick, Chap. 12). Studies 
of extensional reasoning focused at first on 
“Bayesian” reasoning in which participants 
try to infer a conditional probability from the 
premises. These studies offered no account 
of the foundations of extensional reasoning. 
The model theory filled the gap (Johnson- 
Laird, Legrenzi, Girotto, Legrenzi, & Cav- 
erni, 1999), and the present section outlines 
its account. 

Mental models represent the extensions 
of assertions (i.e., the possibilities to which 
they refer). The theory postulates


The principle of equiprobability: Each
mental model is assumed to be equiprobable,
unless there are reasons to the contrary.


The probability of an event accordingly de- 
pends on the proportion of models in which 
it occurs. The theory also allows that mod- 
els can be tagged with numerals denoting 
probabilities or frequencies of occurrence, 
and that simple arithmetical operations 
can be carried out on them. Shimojo and 
Ichikawa (1989) and Falk (1992) proposed 
similar principles for Bayesian reasoning. 
The present account differs from theirs in 
that it assigns equiprobability, not to ac- 
tual events, but to mental models. And 
equiprobability applies only by default. An 
analogous principle of “indifference” oc- 
curred in classical probability theory, but it 
is problematic because it applies to events 
(Hacking, 1975).

Consider a simple problem such as


In the box, there is a green ball or a blue 
ball or both. 

What is the probability that both the 
green and the blue ball are there? 


The premise elicits the mental models: 


green
          blue
green     blue


Naive reasoners follow the equiprobability 
principle, and infer the answer, “1/3.” An ex- 
periment corroborated this and other pre- 
dictions based on the mental models for 
the connectives in Table 9.2 (Johnson-Laird 
et al., 1999). 
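The way equiprobable mental models yield the answer 1/3 can be shown with a few lines of arithmetic. In this sketch each mental model is represented as a set of the items it contains, and the function name `probability` is an illustrative choice.

```python
from fractions import Fraction

# Mental models of "a green ball or a blue ball or both".
mental_models = [{"green"}, {"blue"}, {"green", "blue"}]

def probability(event, models):
    """Proportion of equiprobable models in which every item of the event occurs."""
    matching = [m for m in models if event <= m]
    return Fraction(len(matching), len(models))

both = probability({"green", "blue"}, mental_models)
print(both)
```

Only one of the three models contains both balls, so the proportion is 1/3.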

Conditional probabilities are on the bor- 
derline of naive competence. They are dif- 
ficult because individuals need to consider 
several fully explicit models. Here is a typi- 
cal Bayesian problem: 


The patient’s PSA score is high. If he doesn’t 
have prostate cancer, the chances of such 
a value is 1 in 1000. Is he likely to have 
prostate cancer? 


Many people respond, “yes.” However, they 
are wrong. The model theory predicts the 
error: Individuals represent the conditional
probability in the problem as one explicit
model and one implicit model tagged with
their chances:


¬ prostate cancer    high PSA      1
. . .                            999


The converse conditional probability has 
the same mental models, and so people as- 
sume that if the patient has a high PSA 
the chances are only 1 in 1000 that he 
does not have prostate cancer. Because the 
patient has a high PSA, then he is highly 
likely to have prostate cancer (999/1000). 
To reason correctly, individuals must envis- 
age the complete partition of possibilities 
and chances. However, the problem fails to 
provide enough information. It yields only: 


¬ prostate cancer      high PSA       1
¬ prostate cancer    ¬ high PSA     999
  prostate cancer      high PSA       ?
  prostate cancer    ¬ high PSA       ?

There are various ways to provide the miss- 
ing information. One way is to give the 
base rate of prostate cancer, which can be 
used with Bayes’s theorem from the prob- 
ability calculus to infer the answer. How- 
ever, the theorem and its computations 
are beyond naive individuals (Kahneman & 
Tversky, 1973; Phillips & Edwards, 1966). 
The model theory postulates an alternative: 


The subset principle: Given a complete 
partition, individuals infer the conditional 
probability, P(A | B), by examining the sub- 
set of B that is A and computing its propor- 
tion (Johnson-Laird et al., 1999). 


If models are tagged with their absolute fre- 
quencies or chances, then the conditional 
probability equals their value for the model 
of A and B divided by their sum for all the 
models containing B. A complete partition 
for the patient problem might be 


¬ prostate cancer    high PSA      1
¬ prostate cancer    ¬ high PSA    999
prostate cancer      high PSA      2
prostate cancer      ¬ high PSA    0


The subset of chances of prostate cancer within the two possibilities of a high
PSA (rows 1 and 3) yields the conditional
probability: P(prostate cancer | high PSA) =
2/3. It is high, but far from 999/1000.
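
The arithmetic of the subset principle for the patient problem can be sketched as follows. The encoding is an assumption made for illustration: each model is a pair (prostate cancer, high PSA) tagged with its chances, and the names `PARTITION` and `subset_probability` are hypothetical.

```python
from fractions import Fraction

# Complete partition for the patient problem.
# Keys are (prostate_cancer, high_psa); values are the chances tagged on each model.
PARTITION = {
    (False, True): 1,
    (False, False): 999,
    (True, True): 2,
    (True, False): 0,
}

def subset_probability(partition, a_index, b_index):
    """P(A | B): chances of the models with A and B over chances of all models with B."""
    b_total = sum(n for model, n in partition.items() if model[b_index])
    ab_total = sum(n for model, n in partition.items()
                   if model[b_index] and model[a_index])
    return Fraction(ab_total, b_total)

p_cancer_given_high = subset_probability(PARTITION, a_index=0, b_index=1)
print(p_cancer_given_high)  # 2/3
```

No explicit use of Bayes's theorem is needed: the computation only compares the chances in the two high-PSA models.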

Evolutionary psychologists postulate that 
natural selection led to an innate “mod- 
ule” in the mind that makes Bayesian in- 
ferences from naturally occurring frequen- 
cies. It follows that naive reasoners should 
fail the patient problem because it is about 
a unique event (Cosmides & Tooby, 1996; 
Gigerenzer & Hoffrage, 1995). In contrast, as 
the model theory predicts, individuals cope 
with problems about unique or repeated 
events provided they can use the subset prin- 
ciple and the arithmetic is easy (Girotto & 
Gonzalez, 2001). 

The model theory dispels some common 
misconceptions about probabilistic reason- 
ing. It is not always inductive. Extensional 
reasoning can be deductively valid, and it 
need not depend on a tacit knowledge of the 
probability calculus. It is not always correct 
because it can yield illusions (Table 9.4). 


Induction and Models 


Induction is part of everyday thinking (see 
Sloman & Lagnado, Chap. 5). Popper (1972) 
argued, however, that it is not part of sci- 
entific thinking. He claimed that science is 
based on explanatory conjectures, which ob- 
servations serve only to falsify. Some sci- 
entists agree (e.g., Deutsch, 1997, p. 159). 
However, many astronomical, meteorologi- 
cal, and medical observations are not tests 
of hypotheses. Everyone makes inductions 
in daily life. For instance, when the starter 
will not turn over the engine, your immedi- 
ate thought is that the battery is dead. You 
are likely to be right, but there is no guar- 
antee. Likewise, when the car ferry, Herald 
of Free Enterprise, sailed from Zeebrugge on 
March 6, 1987, its master made the plausi- 
ble induction that the bow doors had been 
closed. They had always been closed in the 
past, and there was no evidence to the con- 
trary. However, they had not been closed, 
the vessel capsized, and many people drowned. Induction is a common but
risky business. 

The textbook definition of induction — 
alas, all too common — is that it leads from 
the particular to the general. Such argu- 
ments are indeed inductions, but many in- 
ductions such as the preceding examples 
are inferences from the particular to the 
particular. That is why the “Introduction” 
offered a more comprehensive definition: 
Induction is a process that increases semantic 
information. As an example, consider again 
the inference: 


The starter won’t turn. 
Therefore, the battery is dead. 


Like all inductions, it depends on knowledge 
and, in particular, on the true conditional: 


If the battery is dead, then the starter 
won't turn. 


It is consistent with the possibilities: 


battery dead      ¬ starter turn
¬ battery dead    ¬ starter turn
¬ battery dead    starter turn


The premise of the induction eliminates the 
third possibility, but the conclusion goes be- 
yond the information given because it elim- 
inates the second of them. The availability 
of the first model yields an intensional infer- 
ence of a high probability, but its conclusion 
rejects a real possibility. Hence, it may be 
false. Inductions are vulnerable because they 
increase semantic information. 
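
The way this induction increases semantic information can be sketched by counting the possibilities that survive each step. The encoding is an assumption for illustration: each possibility is a pair (battery dead, starter turns), and `assert_fact` is a hypothetical helper.

```python
# Fully explicit possibilities for "If the battery is dead, then the starter won't turn".
# Each possibility is (battery_dead, starter_turns).
POSSIBILITIES = {(True, False), (False, False), (False, True)}

def assert_fact(models, index, value):
    """Keep only the possibilities in which the proposition at `index` has `value`."""
    return {m for m in models if m[index] == value}

# Premise: the starter won't turn (rules out the third possibility).
after_premise = assert_fact(POSSIBILITIES, 1, False)
# Inductive conclusion: the battery is dead (rules out the second possibility too).
after_conclusion = assert_fact(after_premise, 0, True)

print(len(POSSIBILITIES), len(after_premise), len(after_conclusion))  # 3 2 1
```

The premise alone leaves two possibilities; the conclusion leaves one. The step from two to one is what goes beyond the information given, and it is why the conclusion may be false.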

Inductions depend on knowledge. As 
Kahneman and Tversky (1982) showed, var- 
ious heuristics constrain the use of knowl- 
edge in inductions. The availability heuris- 
tic, illustrated in the previous example, re- 
lies on whatever relevant knowledge is avail- 
able (e.g., Tversky & Kahneman, 1973). The 
representativeness heuristic yields inferences 
dependent on the representative nature of 
the evidence (e.g., Kahneman & Frederick, 
2002; also see Kahneman & Frederick, Chap. 
12). The present account presupposes these 
heuristics but examines the role of models
in induction.

Some inductions are implicit: They are rapid, involuntary, and unconscious
(see Litman & Reber, Chap. 18). Other in- 
ductions are explicit: They are slow, volun- 
tary, and conscious. This distinction is fa- 
miliar (e.g., Evans & Over, 1996; Johnson- 
Laird & Wason, 1977, p. 341; Sloman, 1996; 
Stanovich, 1999). The next part considers 
implicit inductions, and the part thereafter 
considers explicit inductions and the resolu- 
tion of inconsistencies. 


Implicit Induction and the Modulation 
of Models 


Semantics is central to models, and the con- 
tent of assertions and general knowledge can 
modulate models. Psychologists have pro- 
posed many theories about the mental rep- 
resentation of knowledge, but knowledge is 
about what is possible, and so the model the- 
ory postulates that it is represented in fully 
explicit models (Johnson-Laird & Byrne, 
2002). These models, in turn, modulate the 
mental models of assertions according to 


The principle of modulation: The meanings 
of clauses, coreferential links between them, 
general knowledge, and knowledge of con- 
text, can modulate the models of an asser- 
tion. In the case of inconsistency, meaning 
and knowledge normally take precedence 
over the models of assertions. 


Modulation can add information to mental 
models, prevent their construction, and flesh 
them out into fully explicit models. As an il- 
lustration of semantic modulation, consider 
the following conditional: 


If it’s a game, then it’s not soccer. 


Its fully explicit models (Table 9.2), if they 
were unconstrained by coreference and se- 
mantics, would be 


game      ¬ soccer
¬ game    ¬ soccer
¬ game    soccer


The meaning of the noun soccer entails that 
it is a game, and so an attempt to construct 
the third model would yield an inconsistency. The conditional has only
the first two models. 
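
Semantic modulation of this conditional can be sketched as a filter over its fully explicit models. This is a minimal illustration, not the program described in the chapter: models are encoded as pairs (game, soccer), and the function names are hypothetical.

```python
from itertools import product

def conditional_models(antecedent, consequent):
    """Fully explicit models of 'if A then C' over (game, soccer) truth values."""
    return [m for m in product([True, False], repeat=2)
            if not antecedent(m) or consequent(m)]

# "If it's a game, then it's not soccer": A = game, C = not soccer.
unconstrained = conditional_models(lambda m: m[0], lambda m: not m[1])

def consistent_with_knowledge(model):
    """Knowledge that soccer is a game: (not game, soccer) is impossible."""
    game, soccer = model
    return game or not soccer

modulated = [m for m in unconstrained if consistent_with_knowledge(m)]
print(unconstrained)  # [(True, False), (False, True), (False, False)]
print(modulated)      # [(True, False), (False, False)]
```

The knowledge filter blocks exactly the model in which something is soccer but not a game, leaving the two models that the text identifies.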

The pragmatic effects of knowledge have 
been modeled in a computer program, 
which can be illustrated using the example 


If the match is struck properly, then it 
lights. 

The match is soaking wet and it is struck 
properly. 

What happens? 


In logic, it follows that the match lights, but 
neither people nor the program draws this 
conclusion. Knowledge that wet matches 
do not light overrides the model of the 
premises. The program constructs the men- 
tal model of the premises: 


match wet    match struck    match lights    [the model of the premises]


If a match is soaking wet, it does not light, 
and the program has a knowledge base con- 
taining this information in fully explicit 
models: 


match wet      ¬ match lights
¬ match wet    ¬ match lights
¬ match wet    match lights


The second premise states that the match is 
wet, which triggers the matching possibility 
in the preceding models: 

match wet    ¬ match lights
The conjunction of this model with the 
model of the premises would yield a contra- 
diction, but the program follows the princi- 
ple of modulation and gives precedence to 
knowledge yielding the following model: 
match wet    match struck    ¬ match lights
and so the match does not light. The model 
of the premises also triggers another possi- 
bility from the knowledge base: 


¬ match wet    match lights


This possibility and the model of the
premises are used to construct a counterfactual
conditional:


If it had not been the case that match wet 
and given match struck, then it might have 
been the case that match lights. 


Modulation is rapid and automatic, and 
it affects comprehension and reasoning 
(Johnson-Laird & Byrne, 2002; Newstead, 
Ellis, Evans, & Dennis, 1997; Ormerod & 
Johnson-Laird, in press). In logic, connec- 
tives such as conditionals and disjunctions 
are truth functional, and so the truth value 
of a sentence in which they occur can be 
determined solely from a knowledge of the 
truth values of the clauses they interconnect. 
However, in natural language, connectives 
are not truth functional: It is always nec- 
essary to check whether their content and 
context modulate their interpretation. 


Explicit Induction, Abduction, and the 
Creation of Explanations 


Induction is the use of knowledge to increase 
semantic information: Possibilities are elim- 
inated either by adding elements to a mental 
model or by eliminating a mental model al- 
together. After you have stood in line to no 
avail at a bar in Italy, you are likely to make 
an explicit induction: 


In Italian bars with cashiers, you pay the 
cashier first and then take your receipt to 
the bar to make your order. 


This induction is a general description. You 
may also formulate an explanation: 


The barmen are too busy to make change, 
and so it is more efficient for customers to 
pay a cashier. 


Scientific laws are general descriptions of 
phenomena (e.g., Kepler’s third law describes
the elliptical orbits of the planets).
Scientific theories explain these regularities 
in terms of more fundamental considerations 
(e.g., the general theory of relativity explains 
planetary orbits as the result of the sun’s 
mass curving space-time). Peirce (1903) 
called thinking that leads to explanations abduction. In terms of the five categories of the
“Introduction,” abduction is creative when it 
leads to the revision of beliefs. 

Consider the following problem: 


If a pilot falls from a plane without a 
parachute, the pilot dies. This pilot did not 
die, however. Why not? 


Most people respond, for example, that 


The plane was on the ground. 
The pilot fell into a deep snow drift. 


Only a minority draws the logically valid 
conclusion: 


The pilot did not fall from the plane without 
a parachute. 


Hence, people prefer a causal explanation 
repudiating the first premise to a valid de- 
duction, albeit they may presuppose that 
the antecedent of the conditional is true. 
Granted that knowledge usually takes prece- 
dence over contradictory assertions, the ex- 
planatory mechanism should dominate the 
ability to make deductions. 

In daily life, the propensity to explain is 
extraordinary, as Tony Anderson and this 
author discovered when they asked partic- 
ipants to explain the inexplicable. The par- 
ticipants received pairs of sentences selected 
at random from separate stories: 


John made his way to a shop that sold TV 
sets. 
Celia had recently had her ears pierced. 


In another condition, the sentences were 
modified to make them coreferential: 


Celia made her way to a shop that sold TV 
sets. 
She had recently had her ears pierced. 


The participants’ task was to explain what 
was going on. They readily went beyond 
the given information to account for what 
was happening. They proposed, for example, 
that Celia was getting reception in her ear- 
rings and wanted the TV shop to investigate, 
that she wanted to see some new earrings on 
closed circuit TV, that she had won a bet
by having her ears pierced and was spending
the money on a TV set, and so on. Only
rarely were the participants stumped for an 
explanation. They were almost as
ingenious with the sentences that were not 
coreferential. 

Abduction depends on knowledge, es- 
pecially of causal relations, which accord- 
ing to the model theory refer to tempo- 
rally ordered sets of possibilities (Goldvarg & 
Johnson-Laird, 2001; see Cheng & Buehner, 
Chap. 5). An assertion of the form C
causes E is compatible with three fully ex- 
plicit possibilities: 


C     E
¬C    E
¬C    ¬E


with the temporal constraint that E cannot 
precede C. An “enabling” assertion of the 
form C allows E is compatible with the three 
possibilities: 


C     E
C     ¬E
¬C    ¬E


This account, unlike others, accordingly dis- 
tinguishes between the meaning and logical 
consequences of causes and enabling condi- 
tions (pace, e.g., Einhorn & Hogarth, 1978; 
Hart & Honoré, 1985; Mill, 1874). It also 
treats causal relations as determinate rather 
than probabilistic (pace, e.g., Cheng, 1997; 
Suppes, 1970). Experiments support both 
these claims: Participants listed the previous 
possibilities, and they rejected other cases 
as impossible, contrary to probabilistic ac- 
counts (Goldvarg & Johnson-Laird, 2001). 
Of course, when individuals induce a causal 
relation from a series of observations, they 
are influenced by relative frequencies. How- 
ever, on the present account, the mean- 
ing of any causal relation that they induce 
is deterministic. 
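
The difference between the two sets of possibilities can be made concrete with a short sketch. It assumes that each possibility is encoded as a pair (C, E) and ignores the temporal constraint; the names `CAUSES` and `ALLOWS` and the two helper functions are hypothetical.

```python
# Fully explicit possibilities as (C, E) pairs, following the two lists above.
CAUSES = {(True, True), (False, True), (False, False)}   # "C causes E"
ALLOWS = {(True, True), (True, False), (False, False)}   # "C allows E"

def effects_given_cause(models):
    """Values of E that remain possible once C is known to have occurred."""
    return {e for c, e in models if c}

def causes_given_effect(models):
    """Values of C that remain possible once E is known to have occurred."""
    return {c for c, e in models if e}

print(effects_given_cause(CAUSES))  # {True}: given the cause, the effect must occur
print(effects_given_cause(ALLOWS))  # both values: an enabler does not guarantee E
print(causes_given_effect(CAUSES))  # both values: E may occur without C
```

On this encoding a cause is sufficient but not necessary for its effect, whereas an enabling condition is necessary but not sufficient, which is the logical distinction the account draws.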

Given the cause from a causal relation, 
there is only one possible effect, as the pre- 
vious models show; however, given the ef- 
fect, there is more than one possible cause. 
Exceptions do occur (Cummins, Lubart, 
Alksnis, & Rist, 1991; Markovits, 1984), 
but the principle may explain why inferences from causes to effects
are more plausible than inferences from
effects to causes. As Tversky and Kahneman 
(1982) showed, conditionals in which the 
antecedent is a cause such as 


A girl has blue eyes if her mother has blue 
eyes. 


are judged as more probable than condition- 
als in which the antecedent is an effect: 


The mother has blue eyes if her daughter 
has blue eyes. 


According to the model theory, when in- 
dividuals discover inconsistencies, they try to 
construct a model of a cause and effect that 
resolves the inconsistency. It makes possible 
the facts of the matter, and the belief that 
the causal assertion repudiates is taken to be 
a counterfactual possibility (in a comparable 
way to the modulation of models by knowl- 
edge). Consider, for example, the scenario: 


If the trigger is pulled then the pistol will fire. 
The trigger is pulled, but the pistol does not 
fire. Why not? 


Given 20 different scenarios of this form 
(in an unpublished study carried out by 
Girotto, Legrenzi, & Johnson-Laird), most 
explanations were causal claims that repu- 
diated the conditional. In two further ex- 
periments with the scenarios, the partici- 
pants rated the statements of a cause and 
its effect as the most probable explanations; 
for example, 


A prudent person had unloaded the pistol 
and there were no bullets in the chamber. 


The cause alone was rated as less probable, 
but as more probable than the effect alone, 
which in turn was rated as more probable 
than an explanation that repudiated the cat- 
egorical premise; for example, 


The trigger wasn't really pulled. 


The greater probability assigned to the con- 
junction of the cause and effect than to 
either of its clauses is an instance of the
conjunction fallacy, in which a conjunction
is in error judged to be more probable
than its constituents (Tversky & Kahneman, 
1983). 

Abductions that resolve inconsistencies 
have been implemented in a computer pro- 
gram that uses a knowledge base to create 
causal explanations. Given the preceding ex- 
ample, the program constructs the mental 
models of the conditional: 


trigger pulled pistol fires 


The conjunction of the categorical assertion 
yields 


trigger pulled    pistol fires    [the model of the premises]


That the pistol did not fire is inconsistent 
with this model. The theory predicts that 
individuals should tend to abandon their be- 
lief in the conditional premise because its 
one explicit mental model conflicts with the 
fact that the pistol did not fire (see Girotto, 
Johnson-Laird, Legrenzi, & Sonino, 2000, 
for corroborating evidence). Nevertheless, 
the conditional expresses a useful idealiza- 
tion, and so the program treats it as the basis 
for a counterfactual set of possibilities: 


trigger pulled    ¬ pistol fires    [the model of the facts]
trigger pulled    pistol fires      [the models of counterfactual possibilities]


People know that a pistol without bullets 
does not fire, and so the program has in its 
knowledge base the models: 


¬ bullets in pistol    ¬ pistol fires
bullets in pistol      ¬ pistol fires
bullets in pistol      pistol fires


The model of the facts triggers the first 
possibility in this set, which modulates the 
model of the facts to create a possibility: 
¬ bullets in pistol    trigger pulled    ¬ pistol fires

This possibility in turn triggers a causal antecedent from another set of models
in the knowledge base, which explains
the inconsistency: A person emptied the pistol
and so it had no bullets. The counterfactual
possibilities yield the claim: If the person
had not emptied the pistol, then it would
have had bullets, and... it would have fired. 
The fact that the pistol did not fire has been 
used to reject the conditional premise, and 
available knowledge has been used to create 
an explanation and to modulate the condi- 
tional premise into a counterfactual. There 
are, of course, other possible explanations. 

In sum, reasoners can resolve inconsisten- 
cies between incontrovertible evidence and 
the consequences of their beliefs. They use 
their available knowledge, in the form of
explicit models, to try to create a causal
scenario that makes sense of the facts. Their 
reasoning may resolve the inconsistency, cre- 
ate an erroneous account, or fail to yield any 
explanation whatsoever. 


Conclusions and Further Directions 


Mental models have a past in the nineteenth 
century. The present theory was developed 
in the twentieth century. In its application to 
deduction, as Peirce anticipated, if a conclu- 
sion holds in all the models of the premises, 
it is necessary given the premises. If it holds 
in a proportion of the models, then, granted 
that they are equiprobable, its probability 
is equal to that proportion. If it holds in 
at least one model, then it is possible. The 
theory also applies to inductive reasoning — 
both the rapid implicit inferences that un- 
derlie comprehension and the deliberate in- 
ferences yielding generalizations. It offers an 
account of the creation of causal explana- 
tions. However, if Craik was right, mental 
models underlie all thinking with a proposi- 
tional content, and so the present theory is 
radically incomplete. 
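
The three modal judgments summarized above (necessity, probability, and possibility) can be sketched as a single function over a set of models. The function name `evaluate` and the encoding of models as sets of tokens are assumptions made for illustration, and the probability value presupposes that the models are equiprobable.

```python
from fractions import Fraction

def evaluate(models, conclusion):
    """Classify a conclusion against the models of the premises.

    `models` is a list of sets of tokens; `conclusion` is a predicate on one model.
    """
    holds = [conclusion(m) for m in models]
    return {
        "necessary": all(holds),   # holds in all the models
        "possible": any(holds),    # holds in at least one model
        "probability": Fraction(sum(holds), len(models)),  # proportion of models
    }

# Models of "there is a green ball or a blue ball or both".
models = [{"green"}, {"blue"}, {"green", "blue"}]
some_ball = evaluate(models, lambda m: "green" in m or "blue" in m)
both_balls = evaluate(models, lambda m: {"green", "blue"} <= m)
print(some_ball)
print(both_balls)
```

For the ball problem, the conclusion that some ball is in the box comes out as necessary, whereas the conclusion that both balls are there is merely possible, with a probability of 1/3.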

What of the future of mental models? The 
theory is under intensive development and 
intensive scrutiny. It has been corroborated 
in many experiments, and it is empirically 
distinguishable from other theories. Indeed,
there are distinguishable variants of the theory
itself (see, e.g., Evans, 1993; Ormerod,
Manktelow, & Jones, 1993; Polk & Newell, 
1995). The most urgent demands for the 
twenty-first century are the extension of the 
theory to problem solving, decision making, 
and strategic thinking when individuals com- 
pete or cooperate. 
CHAPTER 10


Visuospatial Reasoning 


Barbara Tversky 


Visuospatial reasoning is not simply a mat- 
ter of running to retrieve a fly ball or wend- 
ing a way through a crowd or plotting a 
path to a destination or stacking suitcases 
in a car trunk. It is a matter of determining
whether gears will mesh (Schwartz
& Black, 1996a), understanding how a car 
brake works (Heiser & Tversky, 2002), dis- 
covering how to destroy a tumor without de- 
stroying healthy tissue (Duncker, 1945; Gick 
& Holyoak, 1980, 1983), and designing a 
museum (Suwa & Tversky, 1997). Perhaps 
more surprising, it is also a matter of decid- 
ing whether a giraffe is more intelligent than 
a tiger (Banks & Flora, 1977; Paivio, 1978), 
whether one event is later than another 
(Boroditsky, 2000), and whether a conclu- 
sion follows logically from its premises (Bar- 
wise & Etchemendy, 1995; Johnson-Laird, 
1983). All these abstract inferences, and 
more, appear to be based on spatial reason- 
ing. Why is that? People begin to acquire 
knowledge about space and the things in it 
probably before they enter the world. In- 
deed, spatial knowledge is critical to survival 
and spatial inference critical to effective sur- 
vival. Perhaps because of the (literal) ubiq- 


uity of spatial reasoning, perhaps because 
of the naturalness of mapping abstract el- 
ements and relations to spatial ones, spatial 
reasoning serves as a basis for abstract knowl- 
edge and inference. The prevalence of spa- 
tial figures of speech in everyday talk attests 
to that: We feel close to some people and 
remote from others; we try to keep our spir- 
its up, to perform at the peak of our pow- 
ers, to avoid falling into depressions, pits, 
or quagmires; we enter fields that are wide 
open, struggling to stay on top of things and 
not get out of our depth. Right now, in this section,
we establish fuzzy boundaries for the
current field of inquiry. 


Reasoning 


Before the research, a few words about the 
words are in order. The core of reasoning 
seems to be, as Bruner put it years ago, go- 
ing beyond the information given (Bruner, 
1973). Of course, nearly every human ac- 
tivity requires going beyond the information 
given. The simplest recognition or general- 
ization task, as well as the simplest action, 
requires going beyond the information given, for, according to a far more ancient
saying, you never step into the same river 
twice. Yet many of these tasks and actions 
do not feel cognitive, do not feel like reason- 
ing. However, the border between percep- 
tual and cognitive processes may be harder 
to establish than the borders between coun- 
tries in conflict. Fortunately, psychology is 
typically free of territorial politics, and so 
establishing boundaries between perception 
and cognition is not essential. There seems to 
be a tacit understanding as to what counts as 
perceptual and what as cognitive, although 
for these categories just as for simpler ones, 
such as chairs and cups, the centers of the 
category enjoy more consensus than the borders.
Invoking principles or requirements for
the boundaries between perception and cog- 
nition — consciousness, for example — seems 
to entail more controversy than the separa- 
tion into territories. 

How do we go beyond the information 
given? Going beyond the information given 
does not necessarily mean adding informa- 
tion. One way to go beyond the information 
given is to transform the information given. 
This is the concern of the earlier part of the 
manuscript. Going beyond the information 
given can also mean transforming the given 
information, sometimes according to rules, 
as in deductive reasoning. Another way to 
go beyond the information given is to make 
inferences or judgments from it. Inference 
and judgment are the concerns of the later 
part of the manuscript. Now some more dis- 
tinctions regarding the visuospatial portion 
of the title are made. 


Representations and Transformations 


Truths are hard to come by in science, 
but useful fictions and approximate truths 
abound. One of these is the distinction 
between representations and transforma- 
tions, between information and processes, 
between data and the operations performed 
on data. Representations place limits on 
transformations as they select and structure 
the information captured from the world 
or the mind. Distinguishing representations 


servation of the brain, is another distinction 
fraught with complexity and controversy. 
Evidence brought to bear for one can fre- 
quently be reinterpreted as evidence for the 
other (e.g., Anderson, 1978). Both represen- 
tations and transformations themselves can 
each be decomposed into representations 
and transformations. Despite these compli- 
cations, the distinction has been a productive 
way to think about psychological processes. 
In fact, it is a distinction that runs deep 
in human cognition, captured in language 
as subject and predicate and in behavior as 
agent/object and action. The distinction will 
prove useful here more than as a way of or- 
ganizing the literature (for related discus- 
sion, see Doumas & Hummel, Chap. 4). 

It has been argued that the very estab- 
lishment of representations entails inferen- 
tial operations. A significant example is the 
Gestalt principles of perceptual organiza- 
tion — grouping by similarity, proximity, 
common fate, and good continuity — that 
contribute to scene segmentation and rep- 
resentation. These are surely a form of vi- 
suospatial inference. Representations are in- 
ternal translations of external stimuli (or 
internal data); as such, they not only elimi- 
nate information from the external world — 
they also add to it and distort it in the service
of interpretation or behavior. Thus, if
inference is to be understood in terms of 
operating on or manipulating information 
to draw new conclusions, then it begins in 
the periphery of the sensory systems with 
leveling and sharpening and feature detec- 
tion and organization. Nevertheless, the field 
has accepted a level of description of repre- 
sentations and transformations — one higher 
than the levels of sensory and perceptual 
processing; that level is reflected here. 


Visuospatial 


What makes visuospatial representations 
visuospatial? Visuospatial transformations 
visuospatial? First and foremost, visuospatial 
representations capture visuospatial proper- 
ties of the world. They do this in a way 
that preserves the structural relations of that information (see
Johnson-Laird, 1983; Peirce in Houser &
Kloesel, 1992). This means that visuospatial
properties that are close or above or below
in the world preserve those relations
in the representations. Visual includes static 
properties of objects, such as shape, texture, 
and color, or between objects and reference 
frames, such as distance and direction. It also 
includes dynamic properties of objects such 
as direction, path, and manner of movement. 
By this account, visuospatial transformations 
are those that change or use visuospatial in- 
formation. Many of these properties of static 
and dynamic objects and of spatial relations 
between objects are available from modal- 
ities other than vision. This may explain 
why well-adapted visually impaired individ- 
uals are not disadvantaged at many spatial 
tasks (e.g., Klatzky, Golledge, Cicinelli, & 
Pellegrino, 1995). Visuospatial representa- 
tions are regarded as contrasting with other 
forms of representation — notably linguis- 
tic. The similarities (e.g., Talmy, 1983, 2001) 
and differences between visuospatial and 
linguistic representations provide insights 
into both. 

Demonstrating properties of internal rep- 
resentations and transformations is tricky for 
another reason; representations are many 
steps from either (controlled) input or 
(observed) output. For these reasons, the 
study of internal representations and processes
was eschewed not only by behaviorists
but also by experimentalists. It was one
of the first areas to flourish after the so- 
called Cognitive Revolution of the 1960s 
with a flurry of innovative techniques to 
demonstrate form and content of internal 
representations and the transformations performed
on them. It is to that research that we
now turn. 


Representations and Transformations 


Visuospatial reasoning can be approached 
bottom-up by studying the elementary rep- 
resentations and processes that presumably 
form the building blocks for more complex
reasoning. It can also be approached
top-down by studying complex reasoning
that has a visuospatial basis. Both approaches
have been productive. We begin
with elements. 


Imagery as Internalized Perception 


The major research tradition studying visuospatial
reasoning from a bottom-up perspective
has been the imagery program pioneered
by Shepard (see Finke & Shepard,
1986; Shepard & Cooper, 1982; Shepard & 
Podgorny, 1978, for overviews) and Kosslyn 
(1980, 1994b), which has aimed to demon- 
strate parallels between visual perception 
and visual imagery. There are two basic 
tenets of the approach, one regarding rep- 
resentations and the other regarding opera- 
tions on representations: that mental images 
resemble percepts and that mental trans- 
formations on images resemble observable 
changes in things in the world, as in men- 
tal rotation, or perceptual processes per- 
formed on things in the world, as in men- 
tal scanning. Kosslyn (1994b) has persisted 
in these aims, more recently demonstrat- 
ing that many of the same neural structures 
are used for both. Not the demonstrations 
per se, but the interpretations of them have 
met with controversy (e.g., Pylyshyn, 1978, 
1981). In attempting to demonstrate the sim- 
ilarities between imagery and perception, 
the imagery program has focused both on 
properties of objects and on characteristics 
of transformations on objects — the former, 
representations, and the latter, operations or 
transformations. The thrust of the research 
programs has been to demonstrate that im- 
ages are like internalized perceptions and 
transformations of images like transforma- 
tions of things in the world. 


REPRESENTATIONS 


In the service of demonstrating that im- 
ages preserve characteristics of perceptions, 
Shepard and his colleagues brought evi- 
dence from similarity judgments as sup- 
port. They demonstrated “second-order 
isomorphisms,” similarity spaces for per- 
ceived and imagined stimuli that have the 
same structure, that is, are fit by the 
same multidimensional space (Shepard & Chipman, 1970). For example,
similarity judgments of shapes of cutouts 
of states conform to the same multidimen- 
sional space as similarity judgments of imag- 
ined shapes of states. The same logic was 
used to show that color is preserved in im- 
ages, as well as configurations of faces (see 
Gordon & Hayward, 1973; Shepard, 1975). 
Similar reasoning was used to demonstrate 
qualitative differences between pictorial and 
verbal representations in a task requiring sequential
same-different judgments on pairs
of schematic faces and names (Tversky, 
1969). The pictorial and verbal similarity 
of the set of faces was orthogonal so the 
“different” responses were a clue to the un- 
derlying representation; times to respond 
“different” were faster when more features 
between the pairs differed. These times indicated
that when participants expected the
target (second) stimulus would be a picture, 
they encoded the first stimulus pictorially, 
whether it had been a picture of a face or 
its name. The converse also held: When the 
target stimulus was expected to be a name, 
participants coded the first stimulus verbally 
irrespective of its presented modality. 

To demonstrate that mental images pre- 
serve properties of percepts, Kosslyn and his 
colleagues presented evidence from studies 
of reaction times to detect features of imag- 
ined objects. One aim is to show that prop- 
erties that take longer to verify in percepts 
take longer to identify in images. For ex- 
ample, when participants were instructed to 
construct images of named animals in order 
to judge whether the animal had a partic- 
ular part, they verified large parts of ani- 
mals, such as the back of a rabbit, faster 
than small but highly associated ones, such 
as the whiskers of a rat. When participants 
were not instructed to use imagery to make 
judgments, they verified small associated 
parts faster than large ones. When not in- 
structed to use imagery, participants used 
their general world knowledge to make judg- 
ments (Kosslyn, 1976). Importantly, when 
the participants explicitly used imagery, they 
took longer to verify parts, large or small, 
than when they relied on world knowledge. 


Further evidence that images 
preserve properties of percepts comes from 
tasks requiring construction of images. Con- 
structing images takes longer when there are 
more parts to the image, even when the 
same figure can be constructed from more 
or fewer parts (Kosslyn, 1980). 

The imagery-as-internalized-perception 
view has proved too narrow to cover the 
variety of visuospatial representations. In ac- 
counting for syllogistic reasoning, Johnson- 
Laird (1983) proposed that people form 
mental models of the situations described 
by the propositions (see Johnson-Laird, 
Chap. 9). Mental models contrast with clas- 
sic images in that they are more schematic. 
Entities are repre- 
sented as tokens, not as likenesses, and 
spatial relations are approximate, almost 
qualitative. A similar view was developed 
to account for understanding text and dis- 
course: listeners and readers construct 
schematic models of the situations described 
(e.g., Kintsch & van Dijk, 1983; Zwaan & 
Radvansky, 1998). As will be seen, visuospatial 
mental representations of environments, de- 
vices, and processes are often schematic, 
even distorted, rather than detailed and ac- 
curate internalized perceptions. 


TRANSFORMATIONS 


Here, the logic is the same for most research 
programs and in the spirit of Shepard’s 
notion of second-order isomorphisms: to 
demonstrate that the times to make par- 
ticular visuospatial judgments in memory 
increase with the times to observe or per- 
form the transformations in the world. The 
dramatic first demonstration was mental ro- 
tation (Shepard & Metzler, 1971): the time to 
judge whether two figures in different ori- 
entations (Figure 10.1) are the same or 
mirror images correlates linearly with the an- 
gular distance between the orientations of 
the figures. The linearity of the relation- 
ship — 12 points on a straight line — suggests 
smooth, continuous mental transformation. 
Although linear functions have been ob- 
tained for the original stimuli, strings of 
10 cubes with two bends, monotonic, but not 
Figure 10.1. Mental rotation task of Shepard 
and Metzler (1971). Participants determine 
whether members of each pair can be rotated 
into congruence. 


linear, functions are obtained for other stim- 
uli such as letters (Shepard & Cooper, 1982). 
There are myriad possible mental transfor- 
mations, only a few of which have been stud- 
ied in detail. They may be classified into 
mental transformations on other objects and 
individuals, and mental transformations on 
oneself. In both cases, the transformations 
may be global (wholistic, of the entire en- 
tity), or they may be operations 
on parts of entities. 
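
The linear relation between judgment time and angular disparity described above can be written as RT = intercept + slope × angle. The following is a minimal sketch, assuming made-up response times generated from an exactly linear rule (they are not data from Shepard and Metzler, 1971); the function name fit_line and every number in it are illustrative assumptions, shown only to make concrete how a slope and an intercept would be estimated.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical illustration only: angular disparities (degrees) and
# response times (seconds) produced by an exactly linear rule.
angles = [0, 40, 80, 120, 160]
times = [1.0 + 0.02 * a for a in angles]

slope, intercept = fit_line(angles, times)
print(f"slope = {slope:.3f} s/deg, intercept = {intercept:.2f} s")
```

On this reading, the slope is the extra time needed per degree of rotation, and a monotonic but nonlinear function, as reported for letters, would leave systematic deviations from the fitted line.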


Mental Transformations on Objects. Ro- 
tation is not the only transformation that 
objects in the world undergo. They can 
undergo changes of size, shape, color, in- 
ternal features, position, combination, and 
more. Mental performance of some of these 
transformations has been examined. The 
time to mentally compare the shapes of 
two rectangles differing in size increases as 
the actual size difference between them in- 
creases (Bundesen, Larsen, & Farrell, 1981; 
Moyer, 1973). New objects can be con- 
structed in imagery, which is a skill presum- 
ably related to design and creativity (e.g., 
Finke, 1990, 1993). In a well-known exam- 
ple, Finke, Pinker, and Farah (1989) asked 
students to imagine a capital letter J centered 
under an upside-down grapefruit half. Stu- 
dents reported “seeing” an umbrella. Even 
without instructions to image, certain tasks 
spontaneously encourage formation of vi- 
sual images. For example, when participants 
are asked whether a described spatial array, 
such as star above plus, matches a depicted 
one, response times indicate that they trans- 
form the description into a depiction when 
given sufficient time to mentally construct 
the situation (Glushko & Cooper, 1978; 
Tversky, 1975). 

In the cases of mental rotation, mental 
movement, and mental size transformations, 
objects or object parts undergo imagined 
transformations. There is also evidence that 
objects can be mentally scanned in a contin- 
uous manner. In a popular task introduced 
by Kosslyn and his colleagues, participants 
memorize a map of an island with several 
landmarks such as a well and a cave. Partic- 
ipants are then asked to conjure an image 
of the map and to imagine looking first at 
the well and then mentally scanning from 
the well to the cave. The general finding is 
that the time to scan mentally between two 
imagined landmarks increases linearly as the 
distance between them increases (Denis & Kosslyn, 
1999; Kosslyn, Ball, & Rieser, 1978; Fig- 
ure 10.2). The phenomenon holds for spa- 
tial arrays established by description rather 
than depiction — again, under instructions to 
form and use images (Denis, 1996). Men- 
tal scanning occurs for arrays in depth and 
for flat perspectives on 3D arrays (Pinker, 
1980). In the previous studies, participants 
were trained to mentally scan and directed to 
do so, leaving open the question of whether 
it occurs spontaneously. It seems to do so in 
a task requiring direction judgments on re- 
membered arrays. Participants first saw an 
array of dots. After the dots disappeared, 
an arrow appeared on the screen. The task 
was to say whether the arrow pointed to 
the previous location of a dot. Reaction 
times increased with distance of the arrow 
to the likely dot, suggesting that participants 
mentally scan from the arrow to answer 
the question (Finke & Pinker, 1982, 1983). 
Mental scanning may be part of catching 
or hitting the ball in baseball, tennis, and 
other sports. 


Applying Several Mental Transformations. 
Other mental transformations on objects are 
possible — for example, altering the internal 
configuration of an object. To solve some 
problems, such as geometric analogies, peo- 
ple need to apply more than one mental 
transformation to a figure to obtain the an- 
swer. In most cases, the order of applying 
the transformations is optional; that is, first 
rotating and then moving a figure yields the 
same answer as first moving and then rotat- 
ing. Nevertheless, people have a preferred 
order for performing a sequence of mental 
transformations, and when this order is vi- 
olated, both errors and performance time 
increase (Novick & Tversky, 1987). What 
accounts for the preferred order? Although 
the mental transformations are performed 
in working memory, the determinants of or- 
der do not seem to be related to working 
memory demands. Move is one of the least 
demanding transformations, and it is typi- 
cally performed first, whereas rotate is one 
of the most difficult transformations and is 
performed second. Then transformations of 
intermediate difficulty are performed. What 
correlates with the order of applying succes- 
sive mental transformations is the order of 
drawing. Move determines where the pencil 
is to be put on the paper, the first act of draw- 
ing. Rotate determines the direction in which 
the first stroke should be taken, and it is the 
next transformation. The next transforma- 
tions to be applied are those that determine 
the size of the figure and its internal details 
(remove, add part, change size, change shading, 
add part). Although the mental transforma- 
tions have been tied to perceptual processes, 
the ordering of performing them appears 
to be tied to a motor process, the act of 
drawing or constructing a figure. This finding 
presaged later work showing that complex 
spatial reasoning has not only percep- 
tual, but also motor, foundations. 
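
The claim that the order of applying transformations is optional can be checked numerically. The sketch below is an illustration only, under the explicit assumption that "rotate" means rotating a figure about its own centroid (rotation about a fixed external point would not commute with moving); the helper names move, rotate, and same_figure are hypothetical and are not taken from Novick and Tversky (1987).

```python
import math


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def move(points, dx, dy):
    """Translate every vertex by (dx, dy)."""
    return [(x + dx, y + dy) for x, y in points]


def rotate(points, degrees):
    """Rotate a figure about its own centroid (assumed meaning of 'rotate')."""
    cx, cy = centroid(points)
    a = math.radians(degrees)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [
        (cx + (x - cx) * cos_a - (y - cy) * sin_a,
         cy + (x - cx) * sin_a + (y - cy) * cos_a)
        for x, y in points
    ]


def same_figure(p, q, tol=1e-9):
    """Compare two vertex lists coordinate by coordinate."""
    return all(abs(x1 - x2) < tol and abs(y1 - y2) < tol
               for (x1, y1), (x2, y2) in zip(p, q))


triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
rotate_then_move = move(rotate(triangle, 90), 3, 4)
move_then_rotate = rotate(move(triangle, 3, 4), 90)
print(same_figure(rotate_then_move, move_then_rotate))
```

Because the two orders give the same figure, a preferred order cannot be explained by the outcome of the transformations, which is why the text looks to the act of drawing for an explanation.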


Mental Transformations of Self. That men- 
tal imagery is both perceptual and motor 
follows from broadening the basic tenets of 
the classical account for imagery. According 
to that account, mental processes are inter- 
nalizations of external or externally driven 
processes — perceptual ones according to 
the classic view (e.g., in the chapter title of 
Shepard & Podgorny, 1978, “Cognitive pro- 
cesses that resemble perceptual processes”). 
The acts of drawing a figure or construct- 
ing an object entail both perceptual and 
motor processes working in concert as do 
many other activities performed in both real 
and virtual worlds, from shaking hands to 
wayfinding. 

Evidence for mental transformations of 
self, or motor imagery, rather than or in addi- 
tion to visual imagery has come from a vari- 
ety of tasks. The time taken to judge whether 
a depicted hand is right or left correlates 
with the time taken to move the hand into 
the depicted orientation as if participants 
were mentally moving their hands in or- 
der to make the right/left decision (Parsons, 
1987b; Sekiyama, 1982). Mental reorienta- 
tion of one’s body has been used to ac- 
count for reaction times to judge whether 
a left or right arm is extended in pictures 
of bodies in varying orientations from up- 
right (Parsons, 1987a). In those studies, re- 
action times depend on the angle of rotation 
and the degree of rotation. For some orienta- 
tions, notably the picture plane, the degree 
of rotation from upright has no effect. This 
allows dissociating mental transformations 
of other (in this case, mental rotation) from 
mental transformations of self (in this case, 
perspective transformations), for the latter do 
yield increases in reaction times with de- 
gree of rotation from upright (Zacks, Mires, 
Tversky, & Hazeltine, 2000; Zacks & Tver- 
sky, in press). Imagining oneself interacting 
with a familiar object such as a ball or a ra- 
zor selectively activates left inferior parietal 
and sensorimotor cortex, whereas imagining 
another interacting with the same objects 
selectively activates right inferior parietal, 




Figure 10.2. Mental scanning. Participants 
memorize map and report time to mentally scan 
from one feature to another (after Kosslyn, Ball, 
& Rieser, 1978). 


precuneus, posterior cingulate, and fron- 
topolar cortex (Ruby & Decety, 2001). 
There have been claims that visual and 
motor imagery, or as we have put it, mental 
transformations of object and of self, share 
the same underlying mechanisms (Wexler, 
Kosslyn, & Berthoz, 1998; Wolschlager & 
Wolschlager, 1998). For example, perform- 
ing clockwise physical rotations facilitates 
performing clockwise mental rotations but 
interferes with performing counterclock- 
wise mental rotations. However, this may 
be because planning, performing, and mon- 
itoring the physical rotation require both 
perceptual and motor imagery. The work 
of Zacks and collaborators (Zacks et al., 
2000; Zacks & Tversky, in press) and Ruby 
and Decety (2001) suggests that these two 
classes of mental transformations are disso- 
ciable. Other studies directly comparing the 
two systems support their dissociability: The 
consequences of using one can be different 
from the consequences of using the other 
(Schwartz, 1999; Schwartz & Black, 1999; 
Schwartz & Holton, 2000). When people 
imagine wide and narrow glasses filled to the 
same level and are asked which would spill 
first when tilted, they are typically incorrect 
from visual imagery. However, if they close 
their eyes and imagine tilting each glass un- 
til it spills, they correctly tilt a wide glass 
less than a narrow one (Schwartz & Black, 
1999). Think of turning a car versus turning a 
boat. To imagine making a car turn right, you 
must imagine rotating the steering wheel to 
the right; however, to imagine making a boat 
turn right, you must imagine moving the 
rudder lever left. In mental rotation of left 
and right hands, the shortest motor path ac- 
counts for the reaction times better than the 
shortest visual path (Parsons, 1987b). Men- 
tal enactment also facilitates memory, even 
for actions described verbally (Englekamp, 
1998). Imagined motor transformations pre- 
sumably underlie mental practice of athletic 
and musical routines — techniques known 
to benefit performance (e.g., Richardson, 
1967). 

The reasonable conclusion, then, is that 
both internalized perceptual transforma- 
tions and internalized motor transformations 
can serve as bases for transformations in 
mental imagery. Perceptual and motor im- 
agery can work in concert in imagery, just 
as perceptual and motor processes work in 
concert in conducting the activities of life. 


ELEMENTARY TRANSFORMATIONS 


The imagery-as-internalized-perception ap- 
proach has provided evidence for myriad 
mental transformations. We have reviewed 
evidence for a number of mental per- 
ceptual transformations: scanning, changing 
orientation, location, size, shape, color; con- 
structing from parts; and rearranging parts. 
Then we have motor transformations: mo- 
tions of bodies, wholes, or parts. This ap- 
proach has the potential to provide a catalog 
of elementary mental transformations that 
are simple inferences and that can combine 
to enable complex inferences. 

The work on inference, judgment, and 
problem solving will suggest transformations 
that have yet to be explored in detail. Here, 
we propose a partial catalog of candidates 
for elementary properties of representations 
and transformations, drawing on the re- 
search reviewed: 


• Determining static properties of entities: 
  figure/ground, symmetry, shape, internal 
  configuration, size, color, texture, and 
  more 

• Determining relations between static 
  entities: 
    o With respect to a frame of reference: 
      location, direction, distance, and more 
    o With respect to other entities, com- 
      paring size, color, shape, texture, loca- 
      tion, orientation, similarity, and other 
      attributes 

• Determining relations of dynamic and static 
  entities: 
    o With respect to other entities or 
      to a reference frame: direction, 
      speed, acceleration, manner, intersec- 
      tion/collision 

• Performing transformations on entities: 
  change location (scanning); change per- 
  spective, orientation, size, shape; mov- 
  ing wholes; reconfiguring parts; zooming; 
  enacting 

• Performing transformations on self: change 
  of perspective, change of location, 
  change of size, shape, reconfiguring parts, 
  enacting 


INDIVIDUAL DIFFERENCES 


Yes, people vary in spatial ability. However, 
spatial ability does not contrast with ver- 
bal ability; in other words, someone can be 
good or poor at both, as well as good in one 
and poor in the other. In addition, spatial 
ability (like verbal ability) is not a single, 
unitary ability. Some of the separate spa- 
tial abilities differ qualitatively; that is, they 
map well onto the kinds of mental transfor- 
mations they require. A meta-analysis of a 
number of factor analyses of spatial abili- 
ties yielded three recurring factors (Linn & 
Peterson, 1986): spatial perception, spatial 
visualization, and mental rotation. Rod-and- 
frame and water-level tasks load high on spa- 
tial perception; this factor seems to reflect 
choice of frame of reference, within an ob- 
ject or extrinsic. Performance on embedded 
figures tasks, finding simple figures in com- 
plex ones, loads high on spatial visualization, 
and performance on mental rotation tasks 
naturally loads high on the mental rotation 
factor. As frequently as they are found, these 
three abilities do not span the range of spa- 
tial competencies. Yet another partially in- 
dependent visuospatial ability is visuospatial 
memory, remembering the layout of a display 
(e.g., Betrancourt & Tversky, in press). The 
number of distinct spatial abilities as well as 
their distinctness remain controversial (e.g., 
Carroll, 1993; Hegarty & Waller, in press). 

More recent work explores the relations 
of spatial abilities to the kinds of men- 
tal transformations that have been distin- 
guished — for example, imagining an object 
rotate versus imagining changing one’s own 
orientation. The mental transformations, in 
turn, are often associated with different 
brain regions (e.g., Zacks, Mires, Tversky, & 
Hazeltine, 2000; Zacks, Ollinger, Sheridan, 
& Tversky, 2002; Zacks & Tversky, in 
press). Kozhevnikov, Kosslyn, and Shepard 
(in press) proposed that spatial visualiza- 
tion and mental rotation correspond respec- 
tively to the two major visual pathways in 
the brain — the ventral “what” pathway un- 
derlying object recognition and the dorsal 
“where” pathway underlying spatial loca- 
tion. Interestingly, scientists and engineers 
score relatively high on mental rotation and 
artists score relatively high on spatial visu- 
alization. Similarly, architects and design- 
ers score higher than average on embed- 
ded figure tasks but not on mental rota- 
tion (Suwa & Tversky, 2003). Associating 
spatial ability measures with mental transfor- 
mations and brain regions is a promising 
direction toward a systematic account of 
spatial abilities. 


Inferences 


Inferences from Observing Motion 
in Space 


To ensure effective survival, in addition to 
perceiving the world as it is, we also need to 
anticipate the world that will be. This 
requires inferences from visuospa- 
tial information. Some common inferences, 
such as determining where to intersect a fly- 
ing object, in particular a fly ball (e.g., 
McBeath, Shaffer, & Kaiser, 1995), or what 
moving parts belong to the same object (e.g., 
Spelke, Vishton, & von Hofsten, 1995) are 
beyond the scope of the chapter. From sim- 
ple, abstract motions of geometric figures, 
people, even babies, infer causal impact and 
basic ontological categories — notably, inani- 
mate and animate. A striking demonstration 
of perception of causality comes from the 
work of Michotte (1946/1963; see Buehner 
& Cheng, Chap. 7). Participants watch films 
of a moving object, A, coming into contact 
with a stationary object, B. When object B 
moves immediately, continuing the direc- 
tion of motion suggested by object A, people 
perceive A as launching B, A as causing B to 
move. When A stops so both A and B are 
stationary before B begins to move, the per- 
ception of a causal connection between A’s 
motion and B’s is lost; their movements are 
seen as independent events. This is a forceful 
demonstration of immediate perception of 
causality from highly abstract actions, as well 
as of the conditions for perception of causal- 
ity. What seems to underlie the perception 
of causality is the perception that object A 
acts on object B. Actions on objects turn out 
to be the basis for segmenting events into 
parts (Zacks, Tversky, & Iyer, 2001). 

In Michotte’s (1946/1963) demonstra- 
tions, the timing of the contact between the 
initially moving object and the stationary ob- 
ject that begins to move later is critical. If 
A stops moving considerably before B be- 
gins to move, then B’s motion is perceived 
to be independent of A’s. B’s movement 
in this case is seen as self-propelled. Self- 
propelled movement is possible only for ani- 
mate agents, or, more recently in the history 
of humanity, for machines. Possible paths 
and trajectories of animate motion differ 
from those for inanimate motion. Preschool 
children can infer which motion paths are 
appropriate for animate and inanimate mo- 
tion, and even for abstract stimuli; they also 
offer sensible explanations for their infer- 
ences (Gelman, Durgin, & Kaufman, 1995). 


People can also make further inferences 
about what gen- 
erated the motion. In point-light films, 
the only thing visible is the movement 
of lights placed at motion junctures, 
for example, the joints of people walk- 
ing or along branches of bushes swaying. 
From point-light films, people can determine 
whether the motion is walking, running, or 
dancing, of men or of women, of friends 
(Cutting & Kozlowski, 1977; Johannson, 
1973; Kozlowski & Cutting, 1977), of bushes 
or trees (Cutting, 1986). Surprisingly, from 
point-light displays of action, people are bet- 
ter at recognizing their own movements than 
those of friends, suggesting that motor ex- 
perience contributes to perception of mo- 
tion (Prasad, Loula, & Shiffrar, 2003). Even 
abstract films of movements of geometric 
figures in sparse environments can be inter- 
preted as complex social interactions, such 
as chasing and bullying, when they are espe- 
cially designed for that (Heider & Simmel, 
1944; Martin & Tversky, 2003; Oatley & 
Yuill, 1985) or playing hide-and-seek, but in- 
terpreting these as intentional actions is not 
immediate; rather, it requires repeated ex- 
posure and possibly instructions to interpret 
the actions (Martin & Tversky, 2003). 

Altogether, simply from abstract mo- 
tion paths or animated point-light displays, 
people can infer several basic ontological 
categories: causal action, animate versus 
inanimate motion, human motion, motion 
of males or females and familiar individuals, 
and social interactions. 


Mental Spatial Inferences 


INFERENCES IN REAL ENVIRONMENTS 


Every kid who has figured out a short-cut, 
and who has not, has performed a spatial 
inference (for a more recent overview of 
kids, see Newcombe & Huttenlocher, 2000). 
Some of these inferences turn out to be 
easier than others, often surprisingly. For 
example, in real environments, inferences 
about where objects will be in relationship 
to oneself after imagined movement in the 
environment turn out to be relatively ac- 
curate when the imagined movement is a 
translation, that is, movement forward or 
backward aligned with the body. How- 
ever, if the imagined movement is rota- 
tional, a change in orientation, updating is 
far less accurate (e.g., Presson & Montello, 
1994; Reiser, 1989). When asked to imagine 
walking forward a certain distance, turning, 
walking forward another distance, and then 
pointing back to the starting point, partic- 
ipants invariably err by not taking into ac- 
count the turn in their pointing (Klatzky, 
Loomis, Beall, Chance, & Golledge, 1998). If 
they actually move forward, turn, and con- 
tinue forward, but blindfolded, they point 
correctly. Spatial updating in real environ- 
ments is more accurate after translation than 
after rotation, and updating after rotation 
is selectively facilitated by physical rotation. 
This suggests a deep point about spatial in- 
ferences and possibly other inferences: that 
in inference, mental acts interact with phys- 
ical acts. 
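
The pointing task just described has a simple geometric solution, which makes the nature of the error concrete. The sketch below is an illustration under stated assumptions: the walker starts at the origin facing "straight ahead", turns are measured clockwise in degrees, and ignoring the turn is modeled as pointing as though the body still faced its initial direction. The function names and the leg lengths are hypothetical and are not taken from Klatzky et al. (1998).

```python
import math


def normalize(angle):
    """Map an angle in degrees into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def pointing_response(d1, turn, d2, update_heading=True):
    """Direction of the starting point relative to the body's facing direction.

    Angles are in degrees clockwise from the facing direction:
    positive means "to the right", negative means "to the left".
    """
    x, y = 0.0, d1                      # first leg, straight ahead
    heading = turn                      # true heading after the turn
    x += d2 * math.sin(math.radians(heading))
    y += d2 * math.cos(math.radians(heading))
    bearing_to_start = math.degrees(math.atan2(-x, -y))
    # Assumption: "ignoring the turn" = treating the heading as unchanged.
    assumed_heading = heading if update_heading else 0.0
    return normalize(bearing_to_start - assumed_heading)


correct = pointing_response(3, 90, 3)
ignoring_turn = pointing_response(3, 90, 3, update_heading=False)
print(round(correct), round(ignoring_turn))  # prints: 135 -135
```

In this example the two responses differ by exactly the size of the turn, which is the pattern expected if position is updated but heading is not.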


GESTURE 


Interaction of mind and body in inference is 
also revealed in gesture. When people de- 
scribe space but are asked to sit on their 
hands to prevent gesturing, their speech fal- 
ters (Rauscher, Krauss, & Chen, 1996), sug- 
gesting that the acts of gesturing promote 
spatial reasoning. Even blind children ges- 
ture as they describe spatial layouts (Iverson 
& Goldin-Meadow, 1997). 

The nature of spontaneous gestures sug- 
gests how this happens. When describing 
continuous processes, people make smooth, 
continuous gestures; when describing dis- 
crete ones, people make jagged, discontin- 
uous ones (Alibali, Bassok, Solomon, Syc, 
& Goldin-Meadow, 1999). For space, peo- 
ple tend to describe environments as if they 
were traveling through them or as if they 
were viewing them from above. The plane 
of their gestures differs in each case in cor- 
respondence with the linguistic perspective 
they adopt (Emmorey, Tversky, & Taylor, 
2000). Earlier, mental transformations that 
appear to be internalized physical transfor- 
mations, such as those underlying handed- 
ness judgments, were described. Here, we 
see that physical transformations, gestures, 
reflect the character of mental ones. 


INFERENCES IN MENTAL ENVIRONMENTS 


The section on inference opened with spa- 
tial inferences made in real environments. 
Often, people make inferences about envi- 
ronments they are not currently in, for ex- 
ample, when they tell a friend how to get 
to their house and where to find the key 
when they arrive. For familiar environments, 
people are quite competent at these sorts 
of spatial inferences. The mental represen- 
tations and processes underlying these in- 
ferences have been studied for several kinds 
of environments — notably the immediately 
surrounding visible or tangible environment 
and the environment too large to be seen 
at a glance. These two situations, the space 
around the body, and the space the body 
navigates, seem to function differently in our 
lives, and consequently, to be conceptualized 
differently (Tversky, 1998). 

Spatial updating for the space around the 
body was first studied using language alone 
to establish the environments (Franklin & 
Tversky, 1990). It is significant that lan- 
guage alone, with no specific instructions 
to form images, was sufficient to establish 
mental environments that people could up- 
date easily and without error. In the proto- 
typical spatial framework task, participants 
read a narrative that describes themselves 
in a 3D spatial scene, such as a museum 
or hotel lobby (Franklin & Tversky, 1990; 
Figure 10.3). The narrative locates and de- 
scribes objects appropriate to the scene be- 
yond the observer’s head, feet, front, back, 
left, and right (locations chosen randomly). 
After participants have learned the scenes 
described by the narratives, they turn to a 
computer that describes them as turning in 
the environment so they are now facing a dif- 
ferent object. The computer then cues them 
with direction terms, front, back, head, and so 
on, to which the participants respond with 
the name of the object now in that direc- 
tion. Of interest are the times to respond, 
depending on the direction from the body. 
The classical imagery account would predict 
that participants will imagine themselves in 
Figure 10.3. Spatial framework situation. 
Participants read a narrative describing objects 
around an observer (after Bryant, Tversky, & 
Franklin, 1992). 


the environment facing the selected object 
and then imagine themselves turning to face 
each cued object in order to retrieve the ob- 
ject in the cued direction. The imagery ac- 
count predicts that reaction times should be 
fastest to the object in front, then to the ob- 
jects 90 degrees away from front, that is, left, 
right, head, and feet, and slowest to objects 
180 degrees from front, that is, objects to the 
back. Data from dozens of experiments fail 
to support that account. 

Instead, the data conform to the spatial 
framework theory according to which partic- 
ipants construct a mental spatial framework 
from extensions of three axes of the body: 
head/feet, front/back, and left/right. Times 
to access objects depend on the asymmetries 
of the body axes as well as the asymmetries 
of the axes of the world. The front/back and 
head/feet axes have important perceptual 
and behavioral asymmetries that are lacking 
in the left/right axis. The world also has three 
axes, only one of which is asymmetric, the 
axis conferred by gravity. For the upright ob- 
server, the head/feet axis coincides with the 
axis of gravity, and so responses to head and 
feet should be fastest, and they are. Accord- 
ing to the spatial framework account, times 
should be next fastest to the front/back axis 
and slowest to the left/right axis, the pat- 
tern obtained for the prototypical situation. 
When narratives describe observers as reclin- 
ing in the scenes, turning from back to side 
to front, then no axis of the body is corre- 
lated with gravity; thus, times depend on the 
asymmetries of the body, and the pattern 
changes. Times to retrieve objects in front 
and back are then fastest because the per- 
ceptual and behavioral asymmetries of the 
front/back axis are most important. This is 
the axis that separates the world that can be 
seen and manipulated from the world that 
cannot be seen or manipulated. 
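
The two accounts make different ordinal predictions for the upright observer, and these can be laid out side by side. The sketch below simply encodes the orderings stated in the text as speed ranks; the dictionary and function names are hypothetical, and the reclining case is left out because the text specifies only that front and back are fastest there.

```python
# Predicted speed ranks (1 = fastest) for retrieving the object in each
# direction, for an upright observer. Directions with the same rank are tied.
IMAGERY_ACCOUNT = {
    "front": 1,
    "left": 2, "right": 2, "head": 2, "feet": 2,   # 90 degrees from front
    "back": 3,                                      # 180 degrees from front
}

SPATIAL_FRAMEWORK_UPRIGHT = {
    "head": 1, "feet": 1,      # body axis aligned with gravity
    "front": 2, "back": 2,     # asymmetric body axis
    "left": 3, "right": 3,     # symmetric body axis
}


def faster(account, a, b):
    """True if the account predicts direction a is retrieved faster than b."""
    return account[a] < account[b]


print(faster(IMAGERY_ACCOUNT, "front", "head"))            # True
print(faster(SPATIAL_FRAMEWORK_UPRIGHT, "front", "head"))  # False
print(faster(SPATIAL_FRAMEWORK_UPRIGHT, "back", "left"))   # True
```

The contrasts that separate the accounts are comparisons such as head versus front and back versus left, where the two tables disagree about which direction should be retrieved faster.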

By now, dozens of experiments have ex- 
amined patterns of response times to system- 
atic changes in the described spatial envi- 
ronment (e.g., Bryant, Tversky, & Franklin, 
1992; Franklin, Tversky, & Coon, 1992). In 
one variant, narratives described participants 
at an oblique angle outside the environ- 
ment looking onto a character (or two!) in- 
side the environment; in that case, none of 
the axes of the observer’s body is corre- 
lated with axes of the characters in the nar- 
rative, and the reaction times to all direc- 
tions are equal (Franklin et al., 1992). In an- 
other variant, narratives described the scene, 
a special space house constructed by NASA, 
as rotating around the observer instead of 
the observer’s turning in the scene (Tver- 
sky, Kim, & Cohen, 1999). That condition 
proved difficult for participants. They took 
twice as long to update the environment 
when the environment moved than when 
the observer moved — a case problematic 
for pure propositional accounts of mental 
spatial transformations. Once participants 
had updated the environment, retrieval 
times corresponded to the spatial frame- 
work pattern. 

Yet other experiments have varied the 
way the environment was conveyed, com- 
paring description, diagram, 3D model, and 
life (Bryant & Tversky, 1999; Bryant, Tver- 
sky, & Lanca, 2001). When the scene is con- 
veyed by narrative, life, or a 3D model, the 
standard spatial framework pattern obtains. 
However, when the scene is conveyed by 
a diagram, participants spontaneously adopt 
an external perspective on the environment, 
performing a mental rotation of the entire 
environment rather than performing a men- 
tal change of their own perspective with re- 
spect to a surrounding environment (Bryant 
& Tversky, 1999). Which viewpoint partici- 
pants adopt, and consequently which mental 
transformation they perform, can be altered 
by instructions. When instructed to do so, 
participants will adopt, from a diagram, the 
internal perspective embedded in the envi- 
ronment, in which the observer turns, or, from 
a model, the external perspective, in which the 
entire environment is rotated, with the pre- 
dicted changes in patterns of retrieval times. 
Similar findings have been reported by 
Huttenlocher and Presson (1979), Wraga, 
Creem, and Proffitt (2000), and Zacks et al. 
(in press). 


ROUTE AND SURVEY PERSPECTIVES 


When people are asked to describe envi- 
ronments that are too large to be seen at a 
glance, they do so from one of two perspec- 
tives (Taylor & Tversky, 1992a, 1996). In a 
route perspective, people address the listener 
as “you,” and take “you” on a tour of the en- 
vironment, describing landmarks relative to 
your current position in terms of your front, 
back, left, and right. In a survey perspective, 
people take a bird’s eye view of the envi- 
ronment and describe locations of landmarks 
relative to one another in terms of north, 
south, east, and west. Speakers (and writers) 
often mix perspectives, contrary to linguists 
who argue that a consistent perspective is 
needed both for coherent construction of 
a message and for coherent comprehen- 
sion (Taylor & Tversky, 1992, 1996; Tversky, 
Lee, & Mainwaring, 1999). In fact, con- 
struction of a mental model is faster when 
perspective is consistent, but the effect is 
small and disappears quickly during retrieval 
from memory (Lee & Tversky, in press). 
In memory for locations and directions of 
landmarks, route and survey statements are 
verified equally quickly and accurately re- 
gardless of the perspective of learning, pro- 
vided the statements are not taken verbatim 
from the text (Taylor & Tversky, 1992b). For 
route perspectives, the mental transforma- 
tion needed to understand the location in- 
formation is a transformation of self, an ego- 
centric transformation of one’s viewpoint 
in an environment. For survey perspectives, 
the mental transformation needed to under- 
stand the location information is a transfor- 
mation of other, a kind of mental scanning 
of an object. 

The prevalence of these two perspectives 
in imagery, the external perspective viewing 
an object or something that can be repre- 
sented as an object and the internal perspec- 
tive viewing an environment from within, 
is undoubtedly associated with their preva- 
lence in the experience of living. In life, we 
observe changes in the orientation, size, and 
configuration of objects in the world and 
scan them for those changes. In life, we move 
around in environments, updating our po- 
sition relative to the locations of other ob- 
jects in the environment. We are adept at 
performing the mental equivalents of these 
actual transformations. There is a natural 
correspondence between the internal and 
external perspectives and the mental trans- 
formations of self and other, but the human 
mind is flexible enough to apply either trans- 
formation to either perspective. Although 
we are biased to take an external perspec- 
tive on objects and mentally transform them 
and biased to take an internal perspective on 
environments and mentally transform our 
bodies with respect to them, we can take 
internal perspectives on objects and ex- 
ternal perspectives on events. The mental 
world allows perspectives and transforma- 
tions, whereas the physical world does not. 
Indeed, conceptualizing a 3D environment 
that surrounds us and is too large to be seen 
at once as a small flat object before the eyes, 
something people, even children, have done 
for eons whenever they produce a map, is 
a remarkable feat of the human mind (cf. 
Tversky, 2000). 


EFFECTS OF LANGUAGE ON SPATIAL THINKING 


Speakers of Dutch and other Western lan- 
guages use both route and survey perspec- 
tives. Put differently, they can use either a 
relative or an absolute (extrinsic) spatial reference system to
describe locations of objects in space. Rela- 
tive systems use the spatial relations “left,” 
“right,” “front,” and “back” to locate objects; 
absolute or extrinsic systems use terms 
equivalent to “north,” “south,” “east,” and 
“west.” A smattering of languages dispersed 
around the world do not describe locations 
using “left” and “right” (Levinson, 2003). In- 
stead, they rely on an absolute system, so 
a speaker of those languages would refer to 
your coffee cup as the “north” cup rather 
than the one on “your right.” Talk appar- 
ently affects thought. Years of talking about 
space using an absolute spatial reference sys- 
tem have had fascinating consequences for 
thinking about space. For example, speakers 
of absolute languages reconstruct a shuffled 
array of objects relative to extrinsic direc- 
tions in contrast to speakers of Dutch, who 
reconstruct the array relative to their own 
bodies. What’s more, when speakers of lan- 
guages with only extrinsic reference sys- 
tems are asked to point home after being 
driven hither and thither, they point with 
impressive accuracy, in contrast to Dutch 
speakers, who point at random. The view 
that the way people talk affects how they 
think has naturally aroused controversy (see 
Gleitman & Papafragou, Chap. 26), but is re- 
ceiving increasing support from a variety of 
tasks and languages (e.g., Boroditsky, 2001; 
Boroditsky, Ham, & Ramscar, 2002). If we 
take a broader perspective, the finding that 
language affects thought is not as startling. 
Language is a tool, like measuring instruments,
arithmetic, or writing; learning
to use these tools also has consequences
for thinking. 
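
The contrast between relative and absolute reference systems can be made concrete with a small sketch. The code below is only an illustration under simplifying assumptions: it reduces directions to four compass points, and the names `absolute_to_relative` and `relative_to_absolute` are hypothetical, not taken from the studies cited. It shows that an absolute description of a location stays fixed, whereas its relative description changes with the direction the viewer faces.

```python
# Illustrative sketch only: the four-way simplification and all names here
# are assumptions made for this example, not part of the cited research.
COMPASS = ["north", "east", "south", "west"]      # clockwise order
EGOCENTRIC = ["front", "right", "back", "left"]   # clockwise from the facing direction


def absolute_to_relative(target: str, facing: str) -> str:
    """Return the body-relative term for an absolute direction, given the viewer's heading."""
    steps = (COMPASS.index(target) - COMPASS.index(facing)) % 4
    return EGOCENTRIC[steps]


def relative_to_absolute(term: str, facing: str) -> str:
    """Return the absolute direction for a body-relative term, given the viewer's heading."""
    steps = (EGOCENTRIC.index(term) + COMPASS.index(facing)) % 4
    return COMPASS[steps]


if __name__ == "__main__":
    # The "north" cup is on the right of a viewer facing west
    # and on the left of a viewer facing east.
    for heading in ("west", "east"):
        print(heading, absolute_to_relative("north", heading))
```

In this sketch, a speaker using the absolute system never needs to know the listener's heading, while a speaker using the relative system must track it continuously.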


Judgments 


Complex visuospatial thinking is fundamen- 
tal to a broad range of human activity, from 
providing directions to the post office and 
understanding how to operate the latest 
electronic device to predicting the conse- 
quences of chemical bonding or designing a 


ing is fundamental to the reasoning processes 
described in other chapters in this handbook, 
as discussed in the chapters on similarity (see 
Goldstone & Son, Chap. 2), categorization 
(see Medin & Rips, Chap. 3), induction (see 
Sloman & Lagnado, Chap. 5), analogical rea- 
soning (see Holyoak, Chap. 6), causality (see 
Buehner & Cheng, Chap. 7), deductive rea- 
soning (see Evans, Chap. 8), mental models 
(see Johnson-Laird, Chap. 9), and problem 
solving (see Novick & Bassok, Chap. 14). For- 
tunately for both reader and author, there is 
no need to repeat those discussions here. 


Distortions as Clues to Reasoning 


Another approach to revealing visuospa- 
tial reasoning has been to demonstrate the 
ways that visuospatial representations differ 
systematically from situations in the world. 
This approach, which can be called the dis- 
tortions program, contrasts with the classi- 
cal imagery approach. The aim of the distor- 
tions approach is to elucidate the processes 
involved in constructing and using men- 
tal representations by showing their conse- 
quences. The distortions approach has fo- 
cused more on relations between objects 
and relations between objects and refer- 
ence frames, as these visuospatial properties 
seem to require more constructive processes 
than those for establishing representations 
of objects. Some systematic distortions have 
also been demonstrated in representations 
of objects. 


REPRESENTATIONS 


Early on, the Gestalt psychologists at- 
tempted to demonstrate that memory for 
figures got distorted in the direction of good 
figures (see Riley, 1962). This claim was con- 
tested and countered by increasingly sophis- 
ticated empirical demonstrations. The dis- 
pute faded in a resolution: visual stimuli 
are interpreted, sometimes as good figures; 
memory tends toward the interpretations. 
So if o-o is interpreted as “eyeglasses,” participants
later draw the connection curved,
whereas if it is interpreted as “barbells,” 
they do not (Carmichael, Hogan, & Walter, 
not appear in recognition memory (Prentice, 
1954). Since then, and relying on the sophis- 
ticated methods developed, there has been 
more evidence for shape distortion in repre- 
sentations. Shapes that are nearly symmet- 
ric are remembered or judged as more sym- 
metric than they actually are, as if people 
code nearly symmetric objects as symmetric 
(Freyd & Tversky, 1984; McBeath, Schiano, 
& Tversky, 1997; Tversky & Schiano, 1989). 
Given that many of the objects and be- 
ings that we encounter are symmetric, but 
are typically viewed at an oblique angle, 
symmetry may be a reasonable assump- 
tion, although one that is wrong on occa- 
sion. Size is compressed in memory (Kerst 
& Howard, 1978). When portions of ob- 
jects are truncated by picture frames, the 
objects are remembered as more complete 
than they actually were (Intraub, Bender, & 
Mangels, 1992). 


REPRESENTATIONS AND TRANSFORMATIONS: SPATIAL 
CONFIGURATIONS AND COGNITIVE MAPS 

The Gestalt psychologists also produced 
striking demonstrations that people organize 
the visual world in principled ways, even 
when that world is a meaningless array (see 
Hochberg, 1978). Entities in space, espe- 
cially ones devoid of meaning, are difficult 
to understand in isolation but easier to grasp 
in context. People group elements in an array 
by proximity or similarity or good continua- 
tion. One inevitable consequence of percep- 
tual organizing principles is distorted repre- 
sentations. 

Many of the distortions reviewed here 
have been instantiated in memory for per- 
ceptual arrays that do not stand for anything. 
They have also been illustrated in memory 
for cognitive maps and for environments. As 
such, they have implications for how people 
reason in navigating the world, a visuospa- 
tial reasoning task that people of all ages and 
parts of the world need to solve. Even more 
intriguing, many of these phenomena have 
analogs in abstract thought. 

For the myriad spatial distortions de- 
scribed here (and analyzed more fully in 


cult to clearly attribute error to either rep- 
resentations or processes. Rather the errors 
seem to be consequences of both, of schema- 
tized, hence distorted, representations con- 
structed ad hoc in order to enable specific 
judgments, such as the direction or distance 
between pairs of cities. When answering 
such questions, it is unlikely that people con- 
sult a library of “cognitive maps.” Rather, it 
seems that they draw on whatever informa- 
tion they have that seems relevant, organiz- 
ing it for the question at hand. The reliability 
of the errors under varying judgments makes 
it reasonable to assume erroneous represen- 
tations are reliably constructed. Some of the 
organizing principles that yield systematic 
errors are reviewed in the next section. 


Hierarchical Organization. Dots that are 
grouped together by good continuation, for 
example, parts of the same square out- 
lined in dots, are judged to be closer than 
dots that are actually closer but parts of 
separate groups (Coren & Girgus, 1980). 
An analogous phenomenon occurs in judg- 
ments of distance between buildings (Hirtle 
& Jonides, 1985): Residents of Ann Arbor 
think that pairs of university (or town) build- 
ings are closer than actually closer pairs of 
buildings that belong to different groups, one 
to the university and the other to the town. 
Hierarchical organization of essentially flat 
spatial information also affects accuracy and 
time to make judgments of direction. People 
incorrectly report that San Diego is west of 
Reno. Presumably this error occurs because 
people know the states to which the cities 
belong and use the overall directions of the 
states to infer the directions between cities in 
the states (Stevens & Coupe, 1978). People 
are faster to judge whether one city is east or 
north of another when the cities belong to 
separate geographic entities than when they 
are actually farther but part of the same ge- 
ographic entity (Maki, 1981; Wilton, 1979). 
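
The inference attributed to Stevens and Coupe (1978) can be sketched as a simple heuristic: when two cities lie in different states, answer with the direction between the states. The following is a minimal illustration, not the authors' model. The function names are hypothetical, the city longitudes are rounded, and the state longitudes are rough centroids assumed for the example.

```python
# Illustrative sketch only. Longitudes are in degrees (negative = west of
# Greenwich); city values are rounded and state values are rough centroids
# assumed for this example.
CITY_LON = {"San Diego": -117.2, "Reno": -119.8}
CITY_STATE = {"San Diego": "California", "Reno": "Nevada"}
STATE_LON = {"California": -119.4, "Nevada": -116.6}


def east_or_west(lon_a: float, lon_b: float) -> str:
    """Say whether the first longitude lies west or east of the second."""
    return "west" if lon_a < lon_b else "east"


def actual_direction(city_a: str, city_b: str) -> str:
    """Direction of city_a from city_b, computed from the cities themselves."""
    return east_or_west(CITY_LON[city_a], CITY_LON[city_b])


def hierarchical_direction(city_a: str, city_b: str) -> str:
    """Direction of city_a from city_b, inferred from the states when they differ."""
    state_a, state_b = CITY_STATE[city_a], CITY_STATE[city_b]
    if state_a != state_b:
        return east_or_west(STATE_LON[state_a], STATE_LON[state_b])
    return actual_direction(city_a, city_b)


if __name__ == "__main__":
    print("heuristic:", hierarchical_direction("San Diego", "Reno"))
    print("actual:   ", actual_direction("San Diego", "Reno"))
```

Under these assumed values the heuristic reproduces the reported error, placing San Diego west of Reno, whereas the city longitudes place it to the east.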

A variant of hierarchical organization 
occurs in locating entities belonging to a 
bounded region. When asked to remember 
the location of a dot in a quadrant, people 
as if they were using general information 
about the area to locate the entity contained 
in it (Huttenlocher, Hedges, & Duncan, 
1991; Newcombe & Huttenlocher, 2000). 


Amount of Information. That representations
are constructed on the fly in the service
of particular judgments seems to be the
case for other distance estimates. Distances 
between A and B, say two locations within a 
town, are greater when there are more cross 
streets or more buildings or more obstacles 
or more turns on the route (Newcombe & 
Liben, 1982; Sadalla & Magel, 1980; Sadalla 
& Staplin, 1980a, 1980b; Thorndyke, 1981), 
as if people mentally construct a represen- 
tation of a path from A to B from that in- 
formation and use the amount of informa- 
tion as a surrogate for the missing exact 
distance information. There is an analogous 
visual illusion: A line appears longer if bi- 
sected and longer still with more tick marks 
(at some point of clutter, the illusion ceases 
or reverses). 


Perspective. Steinberg regaled generations 
of readers of the New Yorker and denizens of 
dormitory rooms with his maps of views of 
the world. In each view, the immediate 
surroundings are stretched and the rest of 
the world shrunk. The psychological reality 
of this genre of visual joke was demonstrated 
by Holyoak and Mah (1982). They asked stu- 
dents in Ann Arbor to imagine themselves 
on either coast and to estimate the distances 
between pairs of cities distributed more or 
less equally on an east-west axis across the 
states. Regardless of imagined perspective, 
students overestimated the near distances 
relative to the far ones. 


Landmarks. Distance judgments are also 
distorted by landmarks. People judge the dis- 
tance of an undistinguished place to be closer 
to a landmark than vice versa (McNamara 
& Diwadkar, 1997; Sadalla, Burroughs, & 
Staplin, 1980). Landmark asymmetries vi- 
olate elementary metric assumptions, as- 
sumptions that are more or less realized in 
real space. 
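
The metric assumption at stake can be stated explicitly. In the notation below, which is introduced here only for illustration, d is distance in real space, d-hat is judged distance, p is an undistinguished place, and L is a landmark. Symmetry is one of the elementary metric axioms, and the landmark effect amounts to a violation of it in judgment:

```latex
% Symmetry axiom of a metric: distance does not depend on direction.
d(x, y) = d(y, x) \quad \text{for all } x, y

% Landmark asymmetry: the ordinary place is judged closer to the landmark
% than the landmark is judged to be to the ordinary place.
\hat{d}(p \to L) < \hat{d}(L \to p)
```

No single two-dimensional layout can satisfy the second relation, because any layout drawn on paper obeys the first.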
Figure 10.4. Alignment. A significant majority 
of participants think the incorrect lower map is 
correct. The map has been altered so the United 
States and Europe and South America and 
Africa are more aligned (after Tversky, 1981). 


Alignment. Hierarchical, perspective, and 
landmark effects can all be regarded as con- 
sequences of the Gestalt principle of group- 
ing. Even groups of two equivalent entities 
can yield distortion. When people are asked 
to judge which of two maps is correct, a map 
of North and South America in which South 
America has been moved westward to over- 
lap more with North America, or the ac- 
tual map, in which the two continents barely 
overlap, the majority of respondents pre- 
fer the former (Tversky, 1981; Figure 10.4). 
A majority of observers also prefer an in- 
correct map of the Americas and Europe/ 
Africa/Asia in which the Americas are 
moved northward so the United States and 
Europe and South America and Africa are 
more directly east-west. This phenomenon 
has been called alignment; it occurs when 
people group two spatial entities and then 
remember them more in correspondence 
than they actually are. It appears not only 
in judgments of maps of the world but also 
in judgments of directions between cities in 
for visual blobs. 

Spatial entities cannot be localized in 
isolation; they can be localized with re- 
spect to other entities or to frames of ref- 
erence. When they are coded with respect 
to another entity, alignment errors are likely. 
When entities are coded with respect to a 
frame of reference, rotation errors, described 
in the next section, are likely. 


Rotation. When people are asked to place 
a cutout of South America in a north-south 
east-west frame, they upright it. A large spa- 
tial object, such as South America, induces 
its own coordinates along an axis of elongation
and an axis perpendicular to that one. The
actual axis of elongation of South America 
is tilted with respect to north-south, and 
people upright it in memory. Similarly, peo- 
ple incorrectly report that Berkeley is east 
of Stanford when it is actually slightly west. 
Presumably this occurs because they up- 
right the Bay Area, which actually runs 
at an angle with respect to north-south. 
This error has been called rotation; it oc- 
curs when people code a spatial entity with 
respect to a frame of reference (Tversky, 
1981; Figure 10.5). Like alignment, rotation
appears in memory for artificial maps and
uninterpreted blobs, as well as in memory for
real environments. Others have replicated 
this error in remembered directions and in 
navigation (e.g., Glicksohn, 1994; Lloyd & 
Heivly, 1987; Montello, 1991; Presson & 
Montello, 1994). 


Are Spatial Representations Incoherent? 
This brief review has brought evidence for 
distortions in memory and judgment for 
shapes of objects, configurations of objects, 
and distances and directions between objects 
that are a consequence of the organization 
of the visuospatial information. These are 
not errors of lack of knowledge; even ex- 
perienced taxi drivers make them (Chase 
& Chi, 1981). Moreover, many of these bi- 
ases have parallels in abstract domains, such 
as judgments about members of one’s own 
social or political groups relative to judg- 
ments about members of other groups (e.g., 
Quattrone, 1986). 


tures all these distortions look like? It would 
look like nothing that can be sketched on a 
sheet of paper, that is, is coherent in two di- 
mensions. Landmark asymmetries alone dis- 
allow that. It does not seem likely that peo- 
ple make these judgments by retrieving a 
coherent prestored mental representation, a 
“cognitive map,” and reading the direction or 
distance from it. Rather, it seems that people 
construct representations on the fly, incorpo- 
rating only the information needed for that 
judgment, the relevant region, the specific 
entities within it. Some of the information 
may be visuospatial from experience or from 
maps; some may be linguistic. For these rea- 
sons, “cognitive collage” seems a more apt 
metaphor than “cognitive map” for what- 
ever representations underlie spatial judg- 
ment and memory (Tversky, 1993). Such 
representations are schematic; they leave 
out much information and simplify others. 
Schematization occurs for at least two rea- 
sons. More exact information may not be 
known and therefore cannot be represented. 
More exact information may not even be 
needed because the situation on the ground 
may fill it in. More information may over- 
load working memory, which is notoriously 
limited. Not only must the representation be 
constructed in working memory, but a judg- 
ment must also be made on the representa- 
tion. Schematization may hide incoherence, 
or it may not be noticed. Schematization 
necessarily entails systematic error. 


Why Do Errors Persist? It is reasonable to 
wonder why so many systematic errors per- 
sist. Some reasons for the persistence of er- 
ror have already been discussed — that there 
may be correctives on the ground, that some 
errors are a consequence of the schematiza- 
tion processes that are an inherent part of 
memory and information processing. Yet an- 
other reason is that the correctives are spe- 
cific — now I know that Rome is north of 
Philadelphia — and do not affect or even 
make contact with the general information 
organizing principle that generated the error 
and that serves us well in many situations 
(e.g., Tversky, 2003a). 
Figure 10.5. Rotation. When asked to place a cutout of South America in a 
NSEW framework, most participants upright it, as in the left example (after 
Tversky, 1981). 


From Spatial to Abstract Reasoning 


Visuospatial reasoning does not only entail 
visuospatial transformations on visuospatial 
information. Visuospatial reasoning also in- 
cludes making inferences from visuospatial 
information, whether that information is in 
the mind or in the world. An early demon- 
stration was the symbolic distance effect (e.g., 
Banks & Flora, 1977; Moyer, 1973; Paivio, 
1978). The time to judge which of two animals
is more intelligent or pleasant is shorter
when the entities are farther on the dimen- 
sion than when they are closer — as if people 
were imagining the entities arrayed on a line 
corresponding to the abstract dimension. It 
is easier, hence faster, to discriminate larger 
distances than smaller ones. Note that a sub- 
jective experience of creating and using an 
image does not necessarily accompany mak- 
ing these and other spatial and abstract judg- 
ments. Spatial thinking can occur regardless 
of whether thinkers have the sensation of 
using an image. So many abstract concepts 
have spatial analogs (for related discussion, 
see Holyoak, Chap. 6). 
Indeed, spatial reasoning is often studied 
in the context of graphics, maps, diagrams, 
graphs, and charts. External representations 
bear similarities to internal representations if 
only because they are creations of the human 
mind, that is, cognitive tools to increase the 
power of the human mind. They also bear 
formal similarities in that both internal and 
external representations are mappings be- 
tween elements and relations. External rep- 
resentations are constrained by a medium 
and unconstrained by working memory; for 
this reason, inconsistencies, ambiguities, and 
incompleteness may be reduced in external 
representations. 


Graphics: Elements 


The readiness with which people map ab- 
stract information onto spatial information 
is part of the reason for the widespread use 
of diagrams to represent and convey ab- 
stract information from the sublime — the 
harmonies of the spheres rampant in re- 
ligions spanning the globe — to the mun- 
dane corporate charts and statistical graphs. 
and spatial relations among the elements. In 
contrast to written (alphabetic) languages, 
both elements and use of space in graphics
can convey meaning rather directly (e.g.,
Bertin, 1967/1983; Pinker, 1994; Tversky, 
1995, 2001; Winn, 1989). Elements may 
consist of likenesses, such as road signs de- 
picting picnic tables, falling rocks, or deer. El- 
ements may also be figures of depiction, sim- 
ilar to figures of speech: synecdoche, where 
a part represents a whole, common in ideo- 
graphic writing, for example, using a ram’s 
horns to represent a ram; or metonymy, 
where an association represents an entity 
or action, which is common in computer 
menus, such as scissors to denote cut text 
or a trashcan to allow deletion of files. 


Graphics: Relations 


Relations among entities preserve different 
levels of information. The information pre- 
served is reflected in the mapping to space. In 
some cases, the information preserved is sim- 
ply categorical; space is used to separate en- 
tities belonging to different categories. The 
spaces between words, for example, indi- 
cate that one set of letters belongs to one 
meaning and another set to another mean- 
ing. Space can also be used to represent ordi- 
nal information, for example, listing historic 
events in their order of occurrence, groceries 
by the order of encountering them in the 
supermarket, and companies by their prof- 
its. Space can be used to represent interval 
or ratio information, as in many statistical 
graphs, where the spatial distances among 
entities reflect their distances on some 
other dimension. 
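
The difference between an ordinal and an interval mapping to space can be illustrated with a short sketch. The function names and the profit figures below are hypothetical, chosen only to show the contrast: an ordinal layout preserves order and spaces items evenly, whereas an interval layout makes spatial distance proportional to the underlying difference.

```python
# Illustrative sketch only: names and example values are assumptions.
from typing import Dict


def ordinal_positions(values: Dict[str, float]) -> Dict[str, float]:
    """Place items on a unit line in rank order, evenly spaced."""
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 0.0}
    return {name: i / (n - 1) for i, name in enumerate(ordered)}


def interval_positions(values: Dict[str, float]) -> Dict[str, float]:
    """Place items on a unit line so that spacing reflects differences in value."""
    low, high = min(values.values()), max(values.values())
    span = high - low
    if span == 0:
        return {name: 0.0 for name in values}
    return {name: (v - low) / span for name, v in values.items()}


if __name__ == "__main__":
    profits = {"A": 10.0, "B": 12.0, "C": 40.0}   # hypothetical companies
    print(ordinal_positions(profits))
    print(interval_positions(profits))
```

With these values, the ordinal layout puts B midway between A and C, while the interval layout puts B close to A, which is the additional information an interval mapping preserves.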


SPONTANEOUS USE OF SPACE TO REPRESENT 
ABSTRACT RELATIONS 

Even preschool children spontaneously use 
diagrammatic space to represent abstract in- 
formation (e.g., diSessa, Hammer, Sherin, & 
Kolpakowski, 1991; Tversky, Kugelmass, & 
Winter, 1991). In one set of studies (Tversky
et al., 1991), children from three language
communities were asked to place
stickers on paper to represent spatial, tem- 


tion, for example, to place stickers for TV 
shows they loved, liked, or disliked. Almost 
all the preschoolers put the stickers on a line, 
preserving ordinal information. Children in 
the middle school years were able to repre- 
sent interval information, but representing 
more than ordinal information was unusual 
for younger children, despite strong manipu- 
lations to encourage them. Not only did chil- 
dren (and adults) spontaneously use spatial 
relations to represent abstract relations, but 
children also showed preferences for the di- 
rection of increases in abstract dimensions. 
Increases were represented from right to left 
or left to right (irrespective of direction of 
writing for quantity and preference) or down 
to up. Representing increasing time or quan- 
tity from up to down was avoided. Rep- 
resenting increases as upward is especially 
robust; it affects people’s ability to make 
inferences about second-order phenomena 
such as rate, which is spontaneously mapped 
to slope, from graphs (Gattis, 2002; Gattis & 
Holyoak, 1996). The correspondence of up- 
ward to more, better, and stronger appears 
in language — on top of the world, rising to 
higher levels of platitude — and in gesture — 
thumbs up, high five — as well as in graph- 
ics. These spontaneous and widespread cor- 
respondences between spatial and abstract 
relations suggest they are cognitively natural 
(e.g., Tversky, 1995a, 2001). 

The demonstrations of spontaneous use 
of spatial language and diagrammatic space 
to represent abstract relations suggest that 
spatial reasoning forms a foundation for 
more abstract reasoning. In fact, children 
used diagrammatic space to represent ab- 
stract relations earlier for temporal relations 
than for quantitative ones, and earlier for 
quantitative relations than for preference re- 
lations (Tversky et al., 1991). Corrobora- 
tive evidence comes from simple spatial and 
temporal reasoning tasks, such as judging 
whether one object or person is before an- 
other. In many languages, words for spatial 
and temporal relations, such as before, after, 
and in between, are shared. Evidence that spatial terms 
are the foundation for the temporal comes 
from research showing priming of temporal 
vice versa (Boroditsky, 2000). More support 
for the primacy of spatial thinking for abstract
thought comes from studies of problem
solving (Carroll, Thomas, & Malhotra,
1980). One group of participants was asked 
to solve a spatial problem under constraints, 
arranging offices to facilitate communica- 
tion among key people. Another group was 
asked to solve a temporal analog, arranging 
processes to facilitate production. The solu- 
tions to the spatial analog were superior to 
those to the temporal analog. When exper- 
imenters suggested using a diagram to yet 
another group solving the temporal analog, 
their success equaled that of the spatial ana- 
log group. 


DIAGRAMS FACILITATE REASONING 


Demonstrating that using a spatial dia- 
gram facilitates temporal problem solving 
also illustrates the efficacy of diagrams in 
thinking — a finding amply supported, even 
for inferences entailing complex logic, such 
as double disjunctions, although to succeed, 
diagrams have to be designed with attention 
to the ways that space and spatial entities are 
used to make inferences (Bauer & Johnson- 
Laird, 1993). Middle school children study- 
ing science were asked to put reminders 
on paper. Those children who sketched dia- 
grams learned the material better than those 
who did not (Rode & Stern, in press). 


DIAGRAMS FOR COMMUNICATING 


Many maps, charts, diagrams, and graphs are 
meant to communicate clearly for travel- 
ers, students, and scholars, whether they are 
professionals or amateurs. To that end, they 
are designed to be clear and easy to com- 
prehend, and they meet with varying suc- 
cess. Good design takes account of human 
perceptual and cognitive skills, biases, and 
propensities. Even ancient Greek vases take 
account of how they will be seen. Because 
they are curved round structures, creating 
a veridical appearance requires artistry. The 
vase “Achilles and Ajax playing a game” by 
the Kleophrades Painter in the Metropolitan 
Museum of Art in New York City (Art. 


that appears in one piece from the desired 
viewing angle, but in three pieces when 
viewed straight on (J. P. Small, personal com- 
munication, May 27, 2003). 

The perceptual and cognitive processes 
and biases that people bring to graphics in- 
clude the catalog of mental representations 
and transformations that was begun earlier. 
In that spirit, several researchers have devel- 
oped models for graph understanding, no- 
tably Pinker (1990), Kosslyn (1989, 1994), 
and Carpenter and Shah (1998) (see Shah 
2003/2004, for an overview). These mod- 
els take account of the particular perceptual 
or imaginal processes that need to be ap- 
plied to particular kinds of graphs to yield 
the right inferences. Others have taken ac- 
count of perceptual and cognitive processing 
in the construction of guidelines for design 
(e.g., Carswell & Wickens, 1990; Cleveland, 
1985; Kosslyn, 1994a; Tufte, 1983, 1990, 
1997; Wainer, 1984, 1997). In some cases the 
design principles are informed by research, 
but in most they are informed by the au- 
thors’ educated sensibilities and/or rules of 
thumb from graphic design. 


Inferences from Diagrams: Structural and 
Functional. The existence of spontaneous 
mapping of abstract information onto spatial 
does not mean that the meanings of diagrams 
are transparent and can be automatically and 
easily extracted (e.g., Scaife & Rogers, 1995). 
Diagrams can support many different classes 
of inferences, notably, structural and func- 
tional (e.g., Mayer & Gallini, 1990). Struc- 
tural inferences, or inferences about quali- 
ties of parts and the relations among them, 
can be readily made from inspection of a di- 
agram. Distance, direction, size, and other 
spatial qualities and properties can be “read 
off” a diagram (Larkin & Simon, 1987), at 
least with some degree of accuracy. “Reading 
off” entails using the sort of mental trans- 
formations discussed earlier, mental scan- 
ning, mental distance, size, shape, or direc- 
tion judgments or comparisons. Functional 
inferences, or inferences about the behav- 
ior of entities, cannot be readily made from 
inspection of a diagram in the absence of 




additional 


are often a consequence of expertise. Spa- 
tial information may provide clues to func- 
tional information, but it is not sufficient 
for concepts such as force, mass, and fric- 
tion. Making functional inferences requires 
linking perceptual information to concep- 
tual information; it entails both knowing 
how to “read” a diagram, that is, what vi- 
suospatial features and relations to inspect 
or transform, and knowing how to interpret 
that visuospatial information. 

Structural and functional inferences re- 
spectively correspond to two senses of men- 
tal model prevalent in the field. In both cases, 
mental model contrasts with image. In one 
sense, a mental model contrasts with an im- 
age in being more skeletal or abstract. This is 
the sense used by Johnson-Laird in his book, 
Mental Models (1983), in his explication of 
how people solve syllogisms (see Johnson- 
Laird, Chap. 9, and Evans, Chap. 8). Here, 
a mental model captures the structural re- 
lations among the parts of a system. In the 
other sense, a mental model contrasts with 
an image in having moving parts, in being 
“runnable” to derive functional or causal in- 
ferences (for related discussion on causal- 
ity, see Buehner and Cheng, Chap. 7, and 
on problem solving, see Chi and Ohlsson, 
Chap. 16). This is the sense used in another 
book also titled Mental Models (Gentner & 
Stevens, 1983). One goal of diagrams is to 
instill mental models in the minds of their 
users. To that end, diagrams abstract the es- 
sential elements and relations of the system 
they are meant to convey. As is seen, convey- 
ing structure is more straightforward than 
conveying function. 

What does it mean to say that a mental 
model is “runnable?” One example comes 
from research on pulley systems (Hegarty, 
1992). Participants were timed to make two 
kinds of judgments from diagrams of three- 
pulley systems. For true-false judgments of 
structural questions, such as “The upper left 
pulley is attached to the ceiling,” response 
times did not depend on which pulley in 
the system was queried. For judgments of 
functional questions, such as “The upper left 
pulley goes clockwise,” response times did 
depend on the order of that pulley in the 
mechanics of the system. To answer func- 
tional questions, it is as if participants men- 
tally animate the pulley system in order to 
generate an answer. Mental animation, how- 
ever, does not seem to be a continuous pro- 
cess in the same way as physical animation. 
Rather, mental animation seems to be a se- 
quence of discrete steps — for example, the 
first pulley goes clockwise, and the rope goes 
under the next pulley to the left of it, so it 
must go counterclockwise. That continuous 
events are comprehended as sequences of 
steps is corroborated by research on segmen- 
tation and interpretation of everyday events, 
such as making a bed (Zacks, Tversky, & 
Iyer, 2001). 

It has long been known that domain ex- 
perts are more adept at functional inferences 
from diagrams than novices. Experts can 
“see” sequences of organized chess moves 
in a midgame display (Chase & Simon, 
1973; De Groot, 1965). Similarly, experts 
in Go (Reitman, 1976), electricity (Egan 
& Schwartz, 1979), weather (Lowe, 1989), 
architecture (Suwa & Tversky, 1997), and 
more make functional inferences with ease 
from diagrams in their domain. Novices 
are no different from experts in structural 
inferences. 


Inferences from Diagrams of Systems. The 
distinction between structural and func- 
tional inferences is illustrated by work on 
production and comprehension of diagrams 
for mechanical systems, such as a car brake, 
a bicycle pump, or a pulley system (Heiser 
& Tversky, 2002; Figure 10.6). Participants 
were asked to interpret a diagram of one of 
the systems. On the whole, their interpreta- 
tions were structural, that is, they described 
the relations among the parts of the system. 
Another set of participants was given the 
same diagrams enriched by arrows indicat- 
ing the sequence of action in the systems. 
Those participants gave functional descrip- 
tions; that is, they described the step-by-step 
operation of the system. Reversing the tasks, 
other groups of participants read structural 
or functional descriptions of the systems 
and produced diagrams of them. Those who 
Figure 10.6. Diagrams of a car brake and a bicycle pump (both after Mayer & Gallini, 1990), and a
pulley system (after Hegarty, 1992). Diagrams without arrows encouraged structural descriptions and 
diagrams with arrows yielded functional descriptions (Heiser and Tversky, in press). 


read functional descriptions used arrows in 
their diagrams far more than those who read 
structural descriptions. Arrows are an extrapictorial
device that has many meanings
and functions in diagrams, such as point- 
ing, indicating temporal sequence, causal se- 
quence, and path and manner of motion 
(Tversky, 2001). 

Expertise came into play in a study of 
learning rather than interpretation. Partic- 
ipants learned one of the mechanical sys- 
tems from a diagram with or without ar- 
rows or from structural or functional text. 
They were later tested on both structural and 
functional information. Participants high in 
expertise/ability (self-assessed) were able to 
infer both structural and functional information
from either diagram. In contrast,
participants low in expertise/ability could 
derive structural but not functional informa- 
tion from the diagrams. Those participants 


were able to infer functional information 
from functional text. This finding suggests 
that people with high expertise/ability can 
form unitary diagrammatic mental models 
of mechanical systems that allow spatial and 
functional inferences with relative ease, but 
people with low expertise/ability have and 
use diagrammatic mental models for struc- 
tural information but rely on propositional 
representations for functional information. 


Enriching Diagrams to Facilitate Functional 
Inferences. As noted, conveying spatial or 
structural information is relatively straight- 
forward in diagrams. Diagrams can use space 
to represent space in direct ways that are 
readily interpreted, as in maps and archi- 
tectural sketches. Conveying information 
that is not strictly spatial, such as change 
over time, forces, and kinematics, is less 
straightforward. Some visual conventions for 
forces have been developed in comics and
in diagrams (e.g., Horn, 1998; Kunzle, 1990; 
McCloud, 1994), and many of these con- 
ventions are cognitively compelling. Arrows 
are a good example. As lines, arrows in- 
dicate a relationship, a link. As asymmet- 
ric lines, they indicate an asymmetric rela- 
tionship. The arrowhead is compelling as an 
indicator of the direction of the asymme- 
try because of its correspondence to arrow- 
heads common as weapons in the world or 
its correspondence to Vs created by paths 
of downward moving water. A survey of 
diagrams in science and engineering texts 
shows wide use of extrapictorial diagram- 
matic devices, such as arrows, lines, brack- 
ets, and insets, although not always consis- 
tently (Tversky, Heiser, Lozano, MacKenzie, 
& Morrison, in press). As a consequence, 
these devices are not always correctly in- 
terpreted. Some diagrams of paradigmatic 
processes, such as the nitrogen cycle in bi- 
ology or the rock cycle in geology, contain 
the same device, typically an arrow, with 
multiple senses, pointing or labeling, indi- 
cating movement path or manner, suggest- 
ing forces or sequence, in the same diagram. 
Of course, there is ambiguity in many words 
that appear commonly in scientific and other 
prose, words that parallel these graphic de- 
vices, such as line and relationship. Neverthe- 
less, the confusion caused by multiple senses 
of diagrammatic devices in interpreting di- 
agrams suggests that greater care in design 
is worthwhile. 

An intuitive way to visualize change over 
time is by animations. After all, an animation 
uses change over time to convey change over 
time, a cognitively compelling correspon- 
dence. Despite the intuitive appeal, a sur- 
vey of dozens of studies that have compared 
animated graphics to informationally com- 
parable static graphics in teaching a wide 
variety of concepts, physical, mechanical, 
and abstract, did not find a single example 
of superior learning by animations (Tversky, 
Morrison, & Betrancourt, 2002). Animations 
may be superior for purposes other than 
learning, for example, in maintaining per- 
spective or in calling attention to a solution 


containing many arrows moving toward the 
center of a display was superior to a diagram 
with static arrows in suggesting the solution 
to the Duncker radiation problem of how to 
destroy a tumor without destroying healthy 
tissue (Pedone, Hummel, & Holyoak, 2001; 
see Holyoak, Chap. 6, Figure 6.4). The fail- 
ure of animations to improve learning itself 
becomes intuitive on further reflection. For 
one thing, animations are often complex, so 
it is difficult for a viewer to know where to 
look and to make sense of the timing of many 
moving components. However, even simple 
animations, such as the path of a single mov- 
ing circle, are not superior to static graphics 
(Morrison & Tversky, in press). The second 
reason for the lack of success of anima- 
tions is one reviewed earlier. If people think 
of dynamic events as sequences of steps 
rather than continuous animations, then 
presenting change over time as sequences 
of steps may make the changes easier 
to comprehend. 


Diagrams for Insight 


Maps for highways and subways, diagrams 
for assembly and biology, graphs for eco- 
nomics and statistics, and plans for electri- 
cians and plumbers are designed to be con- 
cise and unambiguous, although they may 
not always succeed. Their inventors want to 
communicate clearly and without error. In 
contrast are graphics created to be ambigu- 
ous, to allow reinterpretation and discovery. 
Art falls into both those categories. Early de- 
sign sketches are meant to be ambiguous, to 
commit the designer to only those aspects 
of the design that are likely not to change, 
and to leave open other aspects. One reason 
for this is fixation; it is hard to “think out 
of the box.” Visual displays express, suggest, 
more than what they display. That expres- 
sion, in fact, came from solution attempts to 
the famous nine-dot problem (see Novick 
& Bassok, Chap. 14, Fig. 14.4). Connect all 
nine dots in a 3 x 3 array using four straight 
lines without lifting the pen from the pa- 
per. The solution that is hard to see is to 
extend the lines beyond the “box” suggested 
Figure 10.7. A sketch by an architect designing a museum. Upon
reinspection, he made an unintentional discovery (Suwa, Tversky, Gero, &
Purcell, 2001).


by the 3 x 3 array. The Gestalt psychologists 
made us aware of the visual inferences the 
mind makes without reflection, grouping by 
proximity, similarity, good continuation, and 
common fate. 


INFERENCES FROM SKETCHES 


Initial design sketches are meant to be am- 
biguous for several reasons. In early stages of 
design, designers often do not want to com- 
mit to the details of a solution, only the gen- 
eral outline, leaving open many possibilities; 
gradually, they will fill in the details. Per- 
haps more important, skilled designers are 
able to get new ideas by reexamining their 
own sketches, by having a conversation with 
their sketches, bouncing ideas off them (e.g., 
Goldschmidt, 1994; Schon, 1983; Suwa 
& Tversky, 1997; Suwa, Tversky, Gero, & 
Purcell, 2001). They may construct sketches 
with one set of ideas in mind, but on later 
reexamination they see new configurations 
and relations that generate new design ideas. 
The productive cycle between reexamining 
and reinterpreting is revealed in the protocol 
of one expert architect. When he saw a new 


configuration in his own design, he was more 
likely to invent a new design idea; similarly, 
when he invented a new design idea, he was 
more likely to see a new configuration in his 
sketch (Suwa et al., 2001; Figure 10.7). 

Underlying these unintended discoveries 
in sketches is a cognitive skill termed con- 
structive perception, which consists of two 
independent processes: a perceptual one, 
mentally reorganizing the sketch, and a con- 
ceptual one, relating the new organization 
to some design purpose (Suwa & Tversky, 
2003). Participants adept at generating mul- 
tiple interpretations of ambiguous sketches 
excelled at the perceptual ability of finding 
hidden figures and at the cognitive ability of 
finding remote meaningful associations, yet 
these two abilities were uncorrelated. 

Expertise affects the kinds of inferences 
designers are able to make from their 
sketches. Novice designers are adept at per- 
ceptual inferences, such as seeing proxim- 
ity and similarity relations. Expert design- 
ers are also adept at functional inferences, 
such as “seeing” the flow of traffic or the 
changes in light from sketches (Suwa & 
Tversky, 1997). 




Conclusions


Starting with the elements of visuospatial 
representations in the mind, we end with 
visuospatial representations created by the 
mind. Like language, graphics serve to ex- 
press and clarify individual spatial and ab- 
stract concepts. Graphics have an advantage 
over language in expressiveness (Stenning 
& Oberlander, 1995); graphics use elements 
and relations in graphic space to convey el- 
ements and relations in real or metaphoric 
space. As such, they allow inference based 
on the visuospatial processing that people 
have become expert in as a part of their 
everyday interactions with space (Larkin & 
Simon, 1987). As cognitive tools, graphics
facilitate reasoning, both by externalizing, 
thus offloading memory and processing, and 
by mapping abstract reasoning onto spatial 
comparisons and transformations. Graphics 
organize and schematize spatial and abstract 
information to highlight and focus the es- 
sential information. Like language, graphics 
serve to convey spatial and abstract concepts 
to others. They make private thoughts pub- 
lic to a community that can then use and 
revise those concepts collaboratively. 

Of course, graphics and physical and men- 
tal transformations on them are not identi- 
cal to visuospatial representations and rea- 
soning; they are an expression of it. Talk 
about space and actions in it was probably
among the first uses of language, telling oth- 
ers how to find their way and what to look for 
when they get there. Cognitive tools to pro- 
mote visuospatial reasoning were among the 
first to be invented from tokens for property 
counts, believed to be the precursor of writ- 
ten language (Schmandt-Besserat, 1992), to 
trail markers to maps in the sand. Spatial 
thought, spatial language, and spatial graph- 
ics reflect the importance and prevalence 
of visuospatial reasoning in our lives, from 
knowing how to get home to knowing how 
to design a house, from explaining how to 
find the freeway to explaining how the judi- 
cial system works, from understanding basic 
science to inventing new conceptions of the 
origins of the universe. Where do we go from 
here? Onward and upward! 


Acknowledgments


I am grateful to Phil Johnson-Laird and
Jeff Zacks for insightful suggestions on a 
previous draft. Preparation of this chapter 
and some of the research reported were 
supported by Office of Naval Research, 
Grant Numbers N00014-PP-1-0649,
N000140110717, and N000140210534 to
Stanford University. 
CHAPTER 11


Decision Making 


Robyn A. LeBoeuf 
Eldar B. Shafir 


Introduction 


People make countless decisions every day, 
ranging from ones that are barely noticed 
and soon forgotten (“What should I drink 
with lunch?” “What should I watch on 
TV?”), to others that are highly consequen- 
tial (“How should I invest my retirement 
funds?” “Should I marry this person?”). In 
addition to having practical significance, de- 
cision making plays a central role in many 
academic disciplines: Virtually all the social 
sciences — including psychology, sociology, 
economics, political science, and law — rely 
on models of decision-making behavior. This 
combination of practical and scholarly fac- 
tors has motivated great interest in how de- 
cisions are and should be made. Although 
decisions can differ dramatically in scope 
and content, research has uncovered sub- 
stantial and systematic regularities in how 
people make decisions and has led to the 
formulation of general psychological prin- 
ciples that characterize decision-making be- 
havior. This chapter provides a selective re- 
view of those regularities and principles. 


(For further reviews and edited collections, 
see, among others, Hastie & Dawes, 2001; 
Goldstein & Hogarth, 1997; Kahneman & 
Tversky, 2000.) 

The classical treatment of decision mak- 
ing, known as the “rational theory of choice” 
or the “standard economic model,” posits 
that people have orderly preferences that 
obey a few simple and intuitive axioms. 
When faced with a choice problem, deci- 
sion makers are assumed to gauge each al- 
ternative’s “subjective utility” and to choose 
the alternative with the highest. In the face 
of uncertainty about whether outcomes will 
obtain, decision makers are believed to cal- 
culate an option’s subjective expected utility, 
which is the sum of its subjective utilities 
over all possible outcomes weighted by these 
outcomes’ estimated probabilities of occur- 
rence. Deciding then is simply a matter of 
choosing the option with the greatest ex- 
pected utility; indeed, choice is believed to 
reveal a person's subjective utility functions 
and, hence, his or her underlying preferences 
(e.g., Keeney & Raiffa, 1976; Savage, 1954; 
von Neumann & Morgenstern, 1944). 
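
The choice rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three options, their dollar outcomes, and the square-root utility function are hypothetical and do not come from the studies cited above.

```python
import math
from typing import Callable, Dict, List, Tuple

# An option is a list of (probability, outcome) pairs whose probabilities sum to 1.
Option = List[Tuple[float, float]]


def expected_utility(option: Option, u: Callable[[float], float]) -> float:
    """Sum of the utilities of the possible outcomes, weighted by their probabilities."""
    return sum(p * u(x) for p, x in option)


def choose(options: Dict[str, Option], u: Callable[[float], float]) -> str:
    """Return the name of the option with the greatest expected utility."""
    return max(options, key=lambda name: expected_utility(options[name], u))


# Hypothetical options (dollar outcomes) and utility function, for illustration only.
options = {
    "A": [(1.0, 40.0)],
    "B": [(0.5, 0.0), (0.5, 100.0)],
    "C": [(0.2, 0.0), (0.8, 64.0)],
}
utility = math.sqrt  # assumed; any increasing function could be substituted

for name, option in options.items():
    print(name, round(expected_utility(option, utility), 2))
print("chosen:", choose(options, utility))
```

Under the assumed utility function, option C has the greatest expected utility by a narrow margin. A different utility function could single out a different option under the same rule, which is the sense in which observed choices are taken to reveal the underlying utilities.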

The standard view has met with persistent
critiques addressing its inadequacy as a de- 
scription of how decisions are actually made. 
For example, Simon (1955) suggested re- 
placing the rational model with a framework 
that accounted for a variety of human re- 
source constraints, such as bounded atten- 
tion and memory capacity, as well as limited 
time. According to this bounded rationality 
view, it was unreasonable to expect decision 
makers to exhaustively compute options’ ex- 
pected utilities. 

Other critiques have focused on system- 
atic violations of even the most funda- 
mental requirements of the rational the- 
ory of choice. According to the theory, for 
example, preferences should remain unaf- 
fected by logically inconsequential factors 
such as the precise manner in which op- 
tions are described, or the specific proce- 
dure used to elicit preferences (Arrow, 1951, 
1988; Tversky & Kahneman, 1986). How- 
ever, compelling demonstrations emerged 
showing that choices failed to obey sim- 
ple consistency requirements and were, in- 
stead, affected by nuances of the decision 
context that were not subsumed by the nor- 
mative accounts (e.g., Lichtenstein & Slovic, 
1971, 1973; Tversky & Kahneman, 1981). In 
particular, preferences appeared to be con- 
structed, not merely revealed, in the making 
of decisions (Slovic, 1995), and this, in turn, 
was shown to lead to significant and system- 
atic departures from normative predictions. 

The mounting evidence has forced a clear 
division between normative and descriptive 
treatments. The rational model remains the 
normative standard against which decisions 
are often judged, both by experts and by 
novices (cf. Stanovich, 1999). At the same 
time, substantial multidisciplinary research 
has made considerable progress in develop- 
ing models of choice that are descriptively 
more faithful. Descriptive accounts as ele- 
gant and comprehensive as the normative 
model are not yet (and may never be) avail- 
able, but research has uncovered robust prin- 
ciples that play a central role in the mak- 
ing of decisions. In what follows, we review 
some of these principles, and we consider 


with normative expectations. 


Choice Under Uncertainty 


In the context of some decisions, the avail- 
ability of options is essentially certain (as 
when choosing items from a menu or cars 
at a dealer’s lot). Other decisions are made 
under uncertainty: They are “risky” when 
the probabilities of the outcomes are known 
(e.g., gambling or insurance) or, as with most 
real world decisions, they are “ambiguous,” 
in that precise likelihoods are not known and 
must be estimated by the decision maker. 
When deciding under uncertainty, a person 
must consider both the desirability of the po- 
tential outcomes and their likelihoods; much 
research has addressed the manner in which 
these factors are estimated and combined. 


Prospect Theory 


When facing a choice between a risky 
prospect that offers a 50% chance to win 
$200 (and a 50% chance to win nothing) ver- 
sus an alternative of receiving $100 for sure, 
most people prefer the sure gain over the 
gamble, although the two prospects have the 
same expected value. (The expected value 
is the sum of possible outcomes weighted 
by their probabilities of occurrence. The ex- 
pected value of the gamble above is .50 * 
$200 + .50 * 0 = $100.) Such preference
for a sure outcome over a risky prospect of 
equal expected value is called risk aversion; 
people tend to be risk averse when choos- 
ing between prospects with positive out- 
comes. The tendency toward risk aversion 
can be explained by the notion of dimin- 
ishing sensitivity first formalized by Daniel 
Bernoulli (1738/1954). Bernoulli proposed 
that preferences are better described by ex- 
pected utility than by expected value and 
suggested that “the utility resulting from a 
fixed small increase in wealth will be in- 
versely proportional to the quantity of goods 
previously possessed,” thus effectively pre- 
dicting a concave utility function (a func- 
tion is concave if a line joining two points 
Figure 11.1. A concave function for gains.


on the curve lies below the curve). The ex- 
pected utility of a gamble offering a 50% 
chance to win $200 (and 50% nothing) is 
.50 * u($200), where u is the person’s utility
function (u(0) = 0). As illustrated in Figure 11.1,
diminishing sensitivity and a concave utility
function imply that the subjective value
attached to a gain of $100 is more
than one-half of the value attached to a gain 
of $200 (u(100) > .5*u(200)), which entails 
preference for the sure $100 gain and, hence, 
risk aversion. 
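
The arithmetic behind this argument can be checked directly. The sketch below assumes a square-root utility function purely for illustration; Bernoulli's proposal and the figure require only that the function be concave with u(0) = 0, so any such function would serve.

```python
import math


def expected_value(prospect):
    """Sum of the outcomes weighted by their probabilities."""
    return sum(p * x for p, x in prospect)


def expected_utility(prospect, u):
    """Sum of the utilities of the outcomes weighted by their probabilities."""
    return sum(p * u(x) for p, x in prospect)


gamble = [(0.5, 200.0), (0.5, 0.0)]  # 50% chance to win $200, 50% nothing
sure = [(1.0, 100.0)]                # $100 for sure

u = math.sqrt  # assumed concave utility with u(0) = 0

ev_gamble = expected_value(gamble)
ev_sure = expected_value(sure)
eu_gamble = expected_utility(gamble, u)
eu_sure = expected_utility(sure, u)

print("expected values:", ev_gamble, ev_sure)
print("expected utilities:", round(eu_gamble, 2), round(eu_sure, 2))
print("prefers sure gain:", eu_sure > eu_gamble)
```

The two prospects tie on expected value, yet the sure gain has the higher expected utility because u(100) exceeds one-half of u(200) for any concave u with u(0) = 0.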

However, when asked to choose between 
a prospect that offers a 50% chance to lose 
$200 (and a 50% chance of nothing) versus 
losing $100 for sure, most people prefer the 
risky gamble over the certain loss. This is be- 
cause diminishing sensitivity applies to nega- 
tive as well as to positive outcomes: The im- 
pact of an initial $100 loss is greater than that 
of an additional $100, which implies a con- 
vex value function for losses. The expected 
utility of a gamble offering a 50% chance to 
lose $200 is thus greater (i.e., less negative) 
than that of a sure $100 loss: (.50*u(−$200)
> u(−$100)). Such preference for a risky
prospect over a sure outcome of equal ex- 
pected value is described as risk seeking. With 
the exception of prospects that involve very 
small probabilities, risk aversion is generally 
observed in choices involving gains, whereas 
risk seeking tends to hold in choices involv- 
ing losses. 

These insights led to the S-shaped value 
function that forms the basis for prospect 
theory (Kahneman & Tversky, 1979; Tversky 


scriptive theory of choice. The value func- 
tion of prospect theory, illustrated in Fig- 
ure 11.2, has three important properties: (1) 
it is defined on gains and losses rather than 
total wealth, capturing the fact that peo- 
ple normally treat outcomes as departures 
from a current reference point (rather than 
in terms of final assets, as posited by the ra- 
tional theory of choice); (2) it is steeper for 
losses than for gains, thus, a loss of $X is 
more aversive than a gain of $X is attractive, 
capturing the phenomenon of loss aversion; 
and (3) it is concave for gains and convex for 
losses, predicting, as described previously, 
risk aversion in the domain of gains and risk 
seeking in the domain of losses. 
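
The three properties can be made concrete with a simple parametric value function. The power form and the parameter values below (an exponent of 0.88 and a loss-aversion coefficient of 2.25) are assumptions chosen for illustration; they are of the kind often used with prospect theory but are not specified in this chapter.

```python
def value(change, alpha=0.88, loss_aversion=2.25):
    """Subjective value of a gain or loss relative to the reference point.

    The functional form and the default parameters are illustrative assumptions.
    """
    if change >= 0:
        return change ** alpha
    return -loss_aversion * ((-change) ** alpha)


# (1) The argument is a change from the reference point, not total wealth.
print(value(0))  # the reference point itself has value 0

# (2) Loss aversion: a loss of $100 hurts more than a gain of $100 pleases.
print(round(value(100), 1), round(value(-100), 1))

# (3) Concave for gains, convex for losses.
risk_averse_for_gains = value(100) > 0.5 * value(200)
risk_seeking_for_losses = 0.5 * value(-200) > value(-100)
print(risk_averse_for_gains, risk_seeking_for_losses)
```

With these assumed parameters, the sure $100 gain is valued above the even chance at $200, whereas the even chance of losing $200 is valued above the sure $100 loss, reproducing the pattern described in the text.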

In addition, according to prospect the- 
ory, probabilities are not treated linearly; 
instead, people tend to overweight small 
probabilities and to underweight large ones 
(Gonzalez & Wu, 1999; Kahneman & 
Tversky, 1979; Prelec, 2000). This, among 
other things, has implications for the attrac- 
tiveness of gambling and of insurance (which 
typically involve low-probability events), 
and it yields substantial discontinuities at the 
endpoints, where the passage from impos- 
sibility to possibility and from high likeli- 
hood to certainty can have inordinate im- 
pact (Camerer, 1992; Kahneman & Tversky, 
1979). Furthermore, research has suggested 
that the weighting of probabilities can be 
influenced by factors such as the decision 
Figure 11.2. Prospect theory’s value function.
(Heath & Tversky, 1991), or by the level of
affect engulfing the options under consid- 
eration (Rottenstreich & Hsee, 2001). Such 
attitudes toward value and chance entail 
substantial sensitivity to contextual factors 
when making decisions, as discussed further 
in the next section. 
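
Nonlinear weighting of probabilities can be illustrated with a one-parameter weighting function. Both the functional form and the parameter value (0.61) are assumptions adopted for this sketch rather than details given in this chapter. The assumed function is continuous, so it only approximates the endpoint effects by being very steep near 0 and 1.

```python
def weight(p, gamma=0.61):
    """Decision weight attached to a stated probability p (illustrative form)."""
    numerator = p ** gamma
    denominator = (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)
    return numerator / denominator


for p in (0.0, 0.01, 0.50, 0.99, 1.0):
    print(p, round(weight(p), 3))

# The same one-point change in probability matters far more at the endpoints.
from_impossible = weight(0.01) - weight(0.0)
in_the_middle = weight(0.51) - weight(0.50)
to_certainty = weight(1.0) - weight(0.99)
print(round(from_impossible, 3), round(in_the_middle, 3), round(to_certainty, 3))
```

Under these assumptions a 1% chance receives a weight well above .01 and a 99% chance a weight well below .99, which is one way to express the appeal of both lottery tickets and insurance against rare losses.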


The Framing of Risky Decisions 


The previously described attitudes toward 
risky decisions appear relatively straightfor- 
ward, and yet, they yield choice patterns that 
conflict with normative standards. Perhaps 
the most fundamental are “framing effects” 
(Tversky & Kahneman, 1981, 1986): Because 
risk attitudes differ when outcomes are seen 
as gains as opposed to losses, the same deci- 
sion can be framed to elicit conflicting risk 
attitudes. In one example, respondents were 
asked to assume themselves $300 richer and 
to choose between a sure gain of $100 or an 
equal chance to win $200 or nothing. Alter- 
natively, they were asked to assume them- 
selves $500 richer and to choose between 
a sure loss of $100 and an equal chance 
to lose $200 or nothing. The two prob- 
lems are identical in terms of final assets: 
Both amount to a choice between $400 for 
sure versus an even chance at $300 or $500 
(Tversky & Kahneman, 1986). People, how- 
ever, tend to “accept” the provided frame 
and consider the problem as presented, fail- 
ing to reframe it from alternate perspectives. 
As a result, most people choosing between 
“gains” show a risk-averse preference for 
the certain ($400) outcome, whereas most 
of those choosing between “losses” express 
a risk-seeking preference for the gamble. 
This pattern violates the normative require- 
ment of “description invariance,” according 
to which logically equivalent descriptions of 
a decision problem should yield the same 
preferences (see Kühberger, 1995; Levin,
Schneider, & Gaeth, 1998, for reviews). 
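
The equivalence of the two problems in terms of final assets can be verified mechanically. The sketch below uses the dollar amounts from the example; the helper name final_assets is hypothetical.

```python
def final_assets(endowment, prospect):
    """Convert (probability, change) pairs into sorted (probability, final amount) pairs."""
    return sorted((p, endowment + change) for p, change in prospect)


# "Gain" frame: assume yourself $300 richer, then choose.
gain_frame = {
    "sure": final_assets(300, [(1.0, +100)]),
    "gamble": final_assets(300, [(0.5, +200), (0.5, 0)]),
}

# "Loss" frame: assume yourself $500 richer, then choose.
loss_frame = {
    "sure": final_assets(500, [(1.0, -100)]),
    "gamble": final_assets(500, [(0.5, -200), (0.5, 0)]),
}

print(gain_frame)
print(loss_frame)
print("identical in final assets:", gain_frame == loss_frame)
```

Both frames reduce to $400 for sure versus an even chance at $300 or $500, so any difference in the preferences they elicit must come from the description rather than from the outcomes.
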
The acceptance of the problem frame, 
combined with the nonlinear weighting of 
probabilities and, in particular, with the el- 
evated impact of perceived “certainty,” has 
a variety of normatively troubling conse- 


lowing choice between gambles (Tversky & 
Kahneman, 1981, p. 455): 


A. A 25% chance to win $30 
B. A 20% chance to win $45 


Faced with this choice, the majority (58%) 
of participants preferred option B. Now, con- 
sider the following extensionally equivalent 
problem: 


In the first stage of this game, there is a 75%
chance to end the game without winning 
anything, and a 25% chance to move into 
the second stage. If you reach the second 
stage, you have a choice between: 


C. A sure win of $30 
D. An 80% chance to win $45 


The majority (78%) of participants now pre- 
ferred option C over option D, even though, 
when combined with the “first stage” of the 
problem, options C and D are equivalent to 
A and B, respectively. Majority preference 
thus reverses as a function of a supposedly 
irrelevant contextual variation. In this par- 
ticular case, the reversal is due to the impact 
of apparent certainty (which renders option 
C more attractive) and to another important 
factor, namely, people’s tendency to contem- 
plate decisions from a “local” rather than a 
“global” perspective. Note that a combina- 
tion of the two stages in the last problem 
would have easily yielded the same repre- 
sentation as that of the preceding version. 
However, rather than amalgamating across 
events and decisions, as is often assumed 
in normative analyses, people tend to con- 
template each decision separately, which can 
yield conflicting attitudes across choices. We 
return to the issue of local versus global per- 
spectives in a later section. 
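
Combining the two stages is a matter of multiplying probabilities, as the following sketch shows. The function name reduce_two_stage is hypothetical; the probabilities and prizes are those of the example.

```python
def reduce_two_stage(p_reach_second_stage, second_stage_option):
    """Collapse a two-stage game into overall (probability, prize) pairs."""
    return [(p_reach_second_stage * p, prize) for p, prize in second_stage_option]


def expected_value(option):
    """Sum of the prizes weighted by their probabilities."""
    return sum(p * prize for p, prize in option)


option_a = [(0.25, 30)]  # a 25% chance to win $30
option_b = [(0.20, 45)]  # a 20% chance to win $45
option_c = [(1.00, 30)]  # second stage: a sure win of $30
option_d = [(0.80, 45)]  # second stage: an 80% chance to win $45

reduced_c = reduce_two_stage(0.25, option_c)
reduced_d = reduce_two_stage(0.25, option_d)

print(reduced_c, option_a)
print(reduced_d, option_b)
print(expected_value(option_a), expected_value(option_b))
```

Once the 25% chance of reaching the second stage is taken into account, C offers the same overall prospect as A and D the same as B, so a consistent decision maker would rank the members of each pair identically.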

As a further example of framing, it is 
interesting to note that, even within the 
domain of losses, risk attitudes can re- 
verse depending on the context of deci- 
sion. Thus, participants actually tend to 
prefer a sure loss to a risky prospect 
when the sure loss is described as “insur- 
ance” against a low-probability, high-stakes 
loss (Hershey & Schoemaker, 1980). The 
a social norm, making the insurance premium
appear more like an investment than
a loss, with the low-probability, high-stakes 
loss acquiring the character of a neglected 
responsibility rather than a considered risk 
(e.g., Hershey & Schoemaker, 1980; Kahne- 
man & Tversky, 1979; Slovic, Fischhoff, & 
Lichtenstein, 1988). 

The framing of certainty and risk also im- 
pacts people’s thinking about financial trans- 
actions through inflationary times, as illus- 
trated by the following example. Participants 
were asked to imagine that they were 
in charge of buying computers (currently 
priced at $1000) that would be delivered and 
paid for 1 year later, by which time, due to 
inflation, prices were expected to be approx- 
imately 20% higher (and equally likely to be 
above or below the projected 20%). All par- 
ticipants essentially faced the same choice: 
They could agree to pay either $1200 (20% 
more than the current price) upon delivery 
next year, or they could agree to pay the 
going market price in 1 year, which would 
depend on inflation. Reference points were 
manipulated to make one option appear cer- 
tain while the other appeared risky: Half 
the participants saw the contracts framed 
in nominal terms so the $1200 price ap- 
peared certain, whereas the future nominal 
market price (which could be more or less 
than $1200) appeared risky. Other partici- 
pants saw the contracts framed in real terms, 
so the future market price appeared appro- 
priately indexed, whereas precommitting to 
a $1200 price, which could be lower or 
higher than the actual future market price, 
seemed risky. As predicted, in both con- 
ditions respondents preferred the contract 
that appeared certain, preferring the fixed 
price in the nominal frame and the indexed 
price in the “real” frame (Shafir, Diamond, & 
Tversky, 1997). As with many psychological 
tendencies, the preference for certainty can 
mislead in some circumstances, but it may 
also be exploited for beneficial ends, such as 
when the certainty associated with a partic- 
ular settlement is highlighted to boost the 
chance for conflict resolution (Kahneman & 
Tversky, 1995). 
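
Why each contract looks certain in one frame and risky in the other can be seen by evaluating both contracts in nominal and in real terms. The three inflation scenarios below are hypothetical, and treating the "real" cost as the nominal payment deflated by realized inflation is a simplifying assumption of this sketch.

```python
CURRENT_PRICE = 1000.0  # today's price
FIXED_PRICE = 1200.0    # price agreed now, paid on delivery next year

# Hypothetical inflation outcomes centered on the projected 20%.
inflation_scenarios = [0.10, 0.20, 0.30]


def nominal_cost(contract, inflation):
    """Dollars actually paid next year under the given contract."""
    if contract == "fixed":
        return FIXED_PRICE
    return CURRENT_PRICE * (1 + inflation)  # "indexed": going market price


def real_cost(contract, inflation):
    """The same payment expressed in today's dollars."""
    return nominal_cost(contract, inflation) / (1 + inflation)


def spread(values):
    """Range of a set of values; zero means the amount looks certain."""
    return max(values) - min(values)


for contract in ("fixed", "indexed"):
    nominal = [nominal_cost(contract, i) for i in inflation_scenarios]
    real = [real_cost(contract, i) for i in inflation_scenarios]
    print(contract, "nominal spread:", round(spread(nominal), 2),
          "real spread:", round(spread(real), 2))
```

In nominal terms only the fixed price shows no spread across scenarios, whereas in real terms only the indexed price does, so each frame makes a different contract appear to be the certain one.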


Not all decisions involve risk or uncertainty. 
For example, when choosing between items 
in a store, we can be fairly confident that 
the displayed items are available. (Naturally, 
there could be substantial uncertainty about 
one’s eventual satisfaction with the choice, 
but we leave those considerations aside for 
the moment.) The absence of uncertainty, 
however, does not eliminate preference mal- 
leability, and many of the principles dis- 
cussed previously continue to exert an im- 
pact even on riskless decisions. Recall that 
outcomes can be framed as gains or as losses 
relative to a reference point, that losses typ- 
ically “loom larger” than comparable gains, 
and that people tend to accept the presented 
frame. These factors, even in the absence of 
risk, can yield normatively problematic de- 
cision patterns. 


Loss Aversion and the Status Quo 


A fundamental fact about the making of de- 
cisions is loss aversion: According to loss 
aversion, the pain associated with giving 
up a good is greater than the pleasure 
associated with obtaining it (Tversky & 
Kahneman, 1991). This yields “endowment 
effects,” wherein the mere possession of a 
good (such that parting with it is rendered 
a loss) can lead to higher valuation of the 
good than if it were not in one’s possession. 
A classic experiment illustrates this point 
(Kahneman, Knetsch, & Thaler, 1990). Par- 
ticipants were arbitrarily assigned to be sell- 
ers or choosers. The sellers were each given an 
attractive mug, which they could keep, and 
were asked to indicate the lowest amount for 
which they would sell the mug. The choosers 
were not given a mug but were instead asked 
to indicate the amount of money that the 
mug was worth to them. Additional pro- 
cedural details were designed to promote 
truthful estimates; in short, an official mar- 
ket price, $X, was to be revealed; all those 
who valued the mug at more than $X re- 
ceived a mug, whereas those who valued the 
mug below $X received $X. All participants, 
whether sellers or choosers, essentially faced 
they would prefer money over the mug. Because
participants were randomly assigned
to be sellers or choosers, standard expecta- 
tions are that the two groups would value the 
mugs similarly. Loss aversion, however, sug- 
gests that the sellers would set a higher price 
(for what they were about to “lose”) than 
the choosers. Indeed, sellers’ median asking 
price was twice that of choosers. 
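
The logic of the procedure, under which stating one's true valuation can never leave a participant worse off, can be sketched as follows. The payoff function is a simplified reading of the rule described above; the treatment of a stated value exactly equal to $X (the participant receives the money) and the grid of candidate values are assumptions of this sketch.

```python
def payoff(true_value, stated_value, market_price):
    """Worth of the outcome to the participant, judged by his or her true valuation."""
    if stated_value > market_price:
        return true_value   # ends up with the mug
    return market_price     # ends up with $X instead


def truthful_is_never_worse(true_value, candidate_statements, candidate_prices):
    """Check that stating the true value does at least as well as any other statement."""
    return all(
        payoff(true_value, true_value, price) >= payoff(true_value, stated, price)
        for stated in candidate_statements
        for price in candidate_prices
    )


grid = [x / 2 for x in range(0, 21)]  # $0.00 to $10.00 in 50-cent steps

print(all(truthful_is_never_worse(v, grid, grid) for v in grid))

# Overstating can backfire: a mug worth $3 is obtained when $5 was available.
print(payoff(3.0, 8.0, 5.0), "versus", payoff(3.0, 3.0, 5.0))
```

Because misreporting offers no advantage under this rule, the gap between sellers' and choosers' stated values is more plausibly attributed to a difference in valuation than to strategic bargaining.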

Another manifestation of loss aversion is 
a general reluctance to trade, illustrated in 
a study in which one-half of the subjects 
were given a decorated mug, whereas the 
others were given a bar of Swiss choco- 
late (Knetsch, 1989). Later, each subject was 
shown the alternative gift and offered the 
opportunity to trade his or her gift for the 
other. Because the initial allocation of gifts 
was arbitrary and transaction costs minimal, 
economic theory predicts that about one- 
half the participants would exchange their 
gifts. Loss aversion, however, predicts that 
most participants would be reluctant to give 
up a gift in their possession (a loss) to ob- 
tain the other (a gain). Indeed, only 10% 
of the participants chose to trade. This con- 
trasts sharply with standard analysis in which 
the value of a good does not change when it 
becomes part of one’s endowment. 

Loss aversion thus promotes stability
rather than change. It implies that people 
will not accept an even chance to win or lose 
$X, because the loss of $X is more aversive 
than the gain of $X is attractive. In particular, 
it predicts a strong tendency to maintain the 
status quo because the disadvantages of de- 
parting from it loom larger than the advan- 
tages of its alternative (Samuelson & Zeck- 
hauser, 1988). A striking tendency to main- 
tain the status quo was observed in the con- 
text of insurance decisions when New Jersey 
and Pennsylvania both introduced the op- 
tion of a limited right to sue, entitling auto- 
mobile drivers to lower insurance rates. The 
two states differed in what they offered con- 
sumers as the default option: New Jersey 
motorists had to acquire the full right to 
sue (transaction costs were minimal: a signa- 
ture), whereas in Pennsylvania, the full right 
was the default, which could be forfeited 


in favor of the limited alternative. Whereas


only about 20% of New Jersey drivers chose 
to acquire the full right to sue, approxi- 
mately 75% of Pennsylvania drivers chose to 
retain it. The difference in adoption rates 
resulting from the alternate defaults had 
financial repercussions estimated at nearly 
$200 million (Johnson, Hershey, Meszaros,
& Kunreuther, 1993). Another naturally oc- 
curring “experiment” was more recently ob- 
served in Europeans’ choices to be potential 
organ donors (Johnson & Goldstein, 2003). 

In some European nations drivers are by default
organ donors unless they elect not to be,
whereas in other European nations they are, 
by default, not donors unless they choose to 
be. Observed rates of organ donors are al- 
most 98% in the former nations and about 
15% in the latter, a remarkable difference 
given the low transaction costs and the sig- 
nificance of the decision. 

For another example, consider two candi- 
dates, Frank and Carl, who are running for 
election during difficult times and have an- 
nounced target inflation and unemployment 
figures. Frank proposes a 42% yearly infla- 
tion rate and 15% unemployment, whereas 
Carl envisions 23% inflation and 22% un- 
employment. When Carl’s figures repre- 
sent the status quo, Frank’s plans entail 
greater inflation and diminished unemploy- 
ment, whereas when Frank’s figures are the 
status quo, Carl’s plan entails lower inflation 
and greater unemployment. As predicted, 
neither departure from the “current” state 
was endorsed by the majority of respon- 
dents, who preferred whichever candidate 
was said to represent the status quo (Quat- 
trone & Tversky, 1988). 

The status quo bias can affect decisions 
in domains as disparate as job selection 
(Tversky & Kahneman, 1991), investment al- 
location (Samuelson & Zeckhauser, 1988), 
and organ donation (Johnson & Goldstein, 
2003), and it can also hinder the negotiated 
resolution of disputes. If each disputant sees 
the opponent’s concessions as gains but its 
own concessions as losses, agreement will be 
hard to reach because each will perceive it- 
self as relinquishing more than it stands to 
gain. Because loss aversion renders foregone
gains more palatable than comparable losses
(cf. Kahneman, 1992), an insightful mediator
may do best to set all sides’ reference
points low, thus requiring compromises
over outcomes that are mostly perceived 
as gains. 


Semantic Framing 


The tendency to adopt the provided 
frame can lead to “attribute-framing” effects 
(Levin, Schneider, & Gaeth, 1998). A pack- 
age of ground beef, for example, can be 
described as 75% lean or else as 25% fat. 
Not surprisingly, it tends to be evaluated 
more favorably under the former description 
than the latter (Levin, 1987; see also Levin, 
Schnittjer, & Thee, 1988). Similarly, a com- 
munity with a 3.7% crime rate tends to be 
allocated greater police resources than one 
described as 96.3% “crime free” (Quattrone 
& Tversky, 1988). Attribute-framing effects 
are not limited to riskless choice; for exam- 
ple, people are more favorably inclined to- 
ward a medical procedure when its chance 
of success, rather than failure, is highlighted 
(Levin et al., 1988). 

Attribute-framing manipulations affect 
the perceived quality of items by changing 
their descriptions. Part of the impact of such 
semantic factors may be due to spreading ac- 
tivation (Collins & Loftus, 1975), wherein 
positive words (e.g., “crime-free”) activate 
associated positive concepts, and negative 
words activate negative concepts. The psy- 
chophysical properties of numbers also con- 
tribute to these effects. A 96.3% “crime 
free” rate, for example, appears insubstan- 
tially different from 100% and suggests that 
“virtually all” are law abiding. The difference 
between 0% and 3.7%, in contrast, appears 
more substantial and suggests the need for 
intervention (Quattrone & Tversky, 1988). 
Like the risk attitudes previously described, 
such perceptual effects often seem natural 
and harmless in their own right but can 
generate preference inconsistencies that ap- 
pear perplexing, especially given the rather 
mild and often unavoidable manipulations 
(after all, things need to be described one 
way or another) and the trivial computations
required to translate from one frame to another.


Conflict and Reasons 


Choices can be hard to make. People often 
approach difficult decisions by looking for 
a compelling rationale for choosing one op- 
tion over another. At times, compelling ra- 
tionales are easy to come by and to articulate, 
whereas other times no compelling ratio- 
nale presents itself, rendering the conflict be- 
tween options hard to resolve. Such conflict 
can be aversive and can lead people to post- 
pone the decision or to select a “default” 
alternative. The tendency to rely on com- 
pelling rationales that help minimize conflict 
appears benign; nonetheless, it can generate 
preference patterns that are fundamentally 
different from those predicted by normative 
accounts based on value maximization. 


Decisional Conflict 


One way to avoid conflict in choice is to 
opt for what appears to be no choice at 
all, namely, the status quo. In one exam- 
ple (Tversky & Shafir, 1992a), participants 
who were purportedly looking to buy a CD 
player were presented with a Sony player 
that was on a 1-day sale for $99, well below 
the list price. Two-thirds of the participants 
said they would buy such a CD player. An- 
other group was presented with the same 
Sony player and also with a top-of-the-line 
Aiwa player for $159. In the latter case, only 
54% expressed interest in buying either op- 
tion, and a full 46% preferred to wait until 
they learned more about the various mod- 
els. The addition of an attractive option in- 
creased conflict and diminished the number 
who ended up with either player, despite the 
fact that most preferred the initial alterna- 
tive to the status quo. This violates what is 
known as the regularity condition, according 
to which the “market share” of an existing 
option — here, the status quo — cannot be in- 
creased by enlarging the offered set (see also 
Tversky & Simonson, 1993). 
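
The regularity condition can be stated as a one-line check. The sketch below applies it to the share of participants who kept the status quo in the CD player example; the function name is hypothetical, and the one-third figure is simply the complement of the "two-thirds" reported above.

```python
def violates_regularity(share_in_small_set, share_in_large_set):
    """Return True if an option's share rose after the choice set was enlarged."""
    return share_in_large_set > share_in_small_set


# Share of participants who kept the status quo (did not buy a player).
defer_one_option = 1 - 2 / 3   # about one-third when only the Sony was offered
defer_two_options = 0.46       # 46% once the Aiwa was added

print(violates_regularity(defer_one_option, defer_two_options))
```
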

A related pattern was documented using
tasting booths in an upscale grocery store,
where shoppers were offered the opportu- 
nity to taste any of 6 jams in one condition, 
or any of 24 jams in the second (Iyengar 
& Lepper, 2000). In the 6-jams condition, 
40% of shoppers stopped to have a taste 
and, of those, 30% proceeded to purchase 
a jam. In the 24-jam condition, a full 60% 
stopped to taste, but only 3% purchased. 
Presumably, the conflict between so many 
attractive options proved hard to resolve. 
Further studies found that those choosing
goods (e.g., chocolate) from a larger
set later reported lower satisfaction with 
their selections than those choosing from 
a smaller set. Conflict among options thus 
appears to make people less happy about 
choosing, as well as less happy with their 
eventual choices. 
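
The jam figures are easier to compare once the two stages are combined into an overall purchase rate per passing shopper. The short calculation below does this; it assumes that the 3% figure, like the 30% figure, refers to shoppers who stopped to taste, and the function name is hypothetical.

```python
def purchase_rate(stop_rate, buy_rate_among_tasters):
    """Fraction of all passing shoppers who end up buying a jam."""
    return stop_rate * buy_rate_among_tasters


six_jams = purchase_rate(0.40, 0.30)
twenty_four_jams = purchase_rate(0.60, 0.03)

print(f"6 jams:  {six_jams:.1%} of shoppers purchased")
print(f"24 jams: {twenty_four_jams:.1%} of shoppers purchased")
```

On this reading, the larger display drew more tasters but produced far fewer purchases per shopper (roughly 12% versus under 2%).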

Decisional conflict tends to favor default 
alternatives, much as it advantages the sta- 
tus quo. In one study, 80 students agreed to 
fill out a questionnaire in return for $1.50. 
Following the questionnaire, one-half of the 
respondents were offered the opportunity to 
exchange the $1.50 (the default) for one of 
two prizes: a metal Zebra pen, or a pair of 
plastic Pilot pens. The remaining subjects 
were only offered the opportunity to ex- 
change the $1.50 for the Zebra. The pens 
were shown to subjects, who were informed 
that each prize regularly costs just over 
$2.00. The results were as follows. Twenty- 
five percent opted for the payment over the 
Zebra when Zebra was the only alternative, 
but a reliably greater 53% chose the pay- 
ment over the Zebra or the Pilot pens when 
both options were offered (Tversky & Shafir, 
1992a). Whereas the majority of subjects 
took advantage of the opportunity to obtain 
a valuable alternative when only one was of- 
fered, the availability of competing valuable 
alternatives increased the tendency to retain 
the default option. 

Related effects have been documented 
in decisions made by expert physicians and 
legislators (Redelmeier & Shafir, 1995). In 
one scenario, neurologists and neurosurgeons
were asked to decide which of several
patients should be operated on first. Half the
respondents were presented with two patients,
a woman in her
early fifties and a man in his seventies. Others 
saw the same two patients along with a third, 
a woman in her early fifties highly compara- 
ble to the first, so it was difficult to think of a 
rationale for choosing either woman over the 
other. As predicted, more physicians (58%) 
chose to operate on the older man in the 
latter version, where the two highly compa- 
rable women presented decisional conflict, 
than in the former version (38%), in which 
the choice was between only one younger 
woman and the man. 

The addition of some options can gen- 
erate conflict and increase the tendency to 
refrain from choosing. Other options, how- 
ever, can lower conflict and increase the like- 
lihood of making a choice. Asymmetric dom- 
inance refers to the fact that in a choice 
between options A and B, a third option, A’, 
can be added that is clearly inferior to A (but 
not to B), thereby increasing the choice like- 
lihood of A (Huber, Payne, & Puto, 1982). 
For example, a choice between $6 and an 
elegant pen presents some conflict for par- 
ticipants. However, when a less attractive 
pen is added to the choice set, the superior 
pen clearly dominates the inferior pen. This 
dominance provides a rationale for choos- 
ing the elegant alternative and leads to an 
increase in the percentage of those choos- 
ing the elegant pen over the cash. Along 
related lines, the compromise effect occurs 
when the addition of a third, extreme option 
makes a previously available option appear 
as a reasonable compromise, thus increasing 
its popularity (Simonson, 1989; Simonson & 
Tversky, 1992). 
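
The structure of an asymmetrically dominated choice set can be made concrete with a small dominance check. In the sketch below, the two attribute scores assigned to each option are hypothetical numbers invented for illustration; only the pattern matters, namely that the added pen is dominated by the elegant pen but not by the cash.

```python
def dominates(a, b):
    """True if a is at least as good as b on every attribute and strictly
    better on at least one (higher scores are better)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def dominated_by(name, options):
    """Names of all options that dominate the named option."""
    return sorted(other for other in options
                  if other != name and dominates(options[other], options[name]))


# Hypothetical attribute scores: (cash amount, pen quality)
options = {
    "cash": (6, 0),
    "elegant pen": (0, 9),
    "plain pen": (0, 5),  # the added, less attractive pen
}

for name in options:
    print(name, "is dominated by", dominated_by(name, options))
```

The less attractive pen is dominated only by the elegant pen, which is what gives the elegant pen, and not the cash, a ready-made justification.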

Standard normative accounts do not deny 
conflict, nor, however, do they assume any 
direct influence of conflict on choice. (For 
people who maximize utility, there does not 
appear to be much room for conflict: Ei- 
ther the utility difference is large and the 
decision is easy, or it is small and the de- 
cision is of little import.) In actuality, peo- 
ple are concerned with making the “right” 
choice, which can render decisional conflict 
influential beyond mere considerations of
value. Conflict is an integral aspect of decision
making, and the phenomenology of
conflict, which can be manipulated via the 
addition or removal of alternatives, yields 
predictable and systematic violations of stan- 
dard normative predictions. 


Reason-Based Choice 


The desire to make the “right” choice of- 
ten leads people to look for good reasons 
when making decisions, and such reliance 
on reasons helps make sense of phenomena 
that appear puzzling from the perspective 
of value maximization (Shafir, Simonson, 
& Tversky, 1993). Relying on good rea- 
sons seems like sound practice: After all, 
the converse, making a choice without good 
reason, seems unwise. At the same time, 
abiding by this practice can be problem- 
atic because the reasons that come to mind 
are often fleeting, are limited to what is in- 
trospectively accessible, and are not neces- 
sarily those that guide, or ought to guide, 
the decision. For example, participants who 
were asked to analyze why they felt the way 
that they did about a set of jams showed 
less agreement with “expert” ratings of the 
jams than did those who merely stated their 
preferences (Wilson & Schooler, 1991). A 
search for reasons can alter preference in 
line with reasons that come readily to mind, 
but those reasons may be heavily influenced 
by salience, availability, or momentary con- 
text. A heavy focus on a biased set of tem- 
porarily available reasons can cause one to 
lose sight of one’s (perhaps more valid) 
initial feelings (Wilson, Dunn, Kraft, & 
Lisle, 1989). 

Furthermore, a wealth of evidence sug- 
gests that people are not always aware of 
their reasons for acting and deciding (see 
Nisbett & Wilson, 1977). In one example, 
participants presented with four identical 
pairs of stockings and asked to select one 
showed a marked preference for the option
on the right. However, despite this evidence
that choice was governed by position,
no participant mentioned position as
a reason for the choice. Participants easily
generated “reasons” (in which they cited
attributes, such as stocking texture), but 
the reasons they provided bore little resem- 
blance to those that actually guided choice 
(Nisbett & Wilson, 1977). 

Finally, and perhaps most normatively 
troubling, a reliance on reasons can induce 
preference inconsistencies because nuances 
in decisional context can render certain rea- 
sons more or less apparent. In one study 
(Tversky & Shafir, 1992b), college students 
were asked to imagine that they had just 
taken and passed a difficult exam and now 
had a choice for the Christmas holidays: 
They could buy an attractive vacation pack- 
age at a low price, they could forego the va- 
cation package, or they could pay a $5 fee 
to defer the decision by a day. The major- 
ity elected to buy the vacation package, and 
less than one-third elected to delay the deci- 
sion. A second group was asked to imagine 
that they had taken the exam and failed and 
would need to retake it after the Christmas 
holidays. They were then presented with 
the same choice and, as before, the major- 
ity elected to buy the vacation package; less 
than one-third preferred to defer. However, 
when a third group of participants was to 
imagine they did not know whether they 
had passed or failed the exam, the major- 
ity preferred to pay to defer the decision 
until the next day, when the exam result 
would be known, and only a minority was 
willing to commit to the trip without know- 
ing. Apparently, participants were comfort- 
able booking the trip when they had clear 
reasons for the decision — celebrating when 
they passed the exam or recuperating when 
they had failed — but were reluctant to com- 
mit when their reasons for the trip were 
uncertain. This pattern, which violates the 
sure thing principle (Savage, 1954), has been 
documented in a variety of contexts, includ- 
ing gambling and strategic interactions (e.g., 
prisoner’s dilemmas; see also Shafir, 1994; 
Shafir & Tversky, 1992). 
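
The sure thing principle lends itself to a simple consistency check: if the same option is preferred under each possible outcome of an event, it should also be preferred when the outcome is unknown. The sketch below encodes the modal choices from the vacation study; the function name and the string labels are hypothetical.

```python
def violates_sure_thing(pref_if_event, pref_if_not_event, pref_when_unknown):
    """True if the same option is preferred under both outcomes of an event
    but a different option is preferred when the outcome is unknown."""
    return (pref_if_event == pref_if_not_event
            and pref_when_unknown != pref_if_event)


# Modal choices reported for the three groups.
print(violates_sure_thing(pref_if_event="buy",        # passed the exam
                          pref_if_not_event="buy",    # failed the exam
                          pref_when_unknown="defer")) # result not yet known
```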

The tendency to delay decision for the 
sake of further information can have a 
significant impact on the ensuing choice. 
Consider the following scenario (Bastardi &
Shafir, 1998):


For some time, you have considered adding 
a compact disc (CD) player to your stereo 
system. You now see an ad for a week-long 
sale offering a very good CD player for only 
$120, 50% off the retail price. Recently, 
however, your amplifier broke. You learn 
that your warranty has expired and that 
you have to pay $90 for repairs. 


One group (the “simple” condition) was 
asked whether they would buy the CD 
player during the sale, and the vast major- 
ity (91%) said they would. Another (“uncer- 
tain”) group was presented with the same 
scenario, but was told that they would not 
know until the next day whether the war- 
ranty covered the $90 repairs. They could 
wait until the following day (when they 
would know about the warranty) to de- 
cide whether to buy the CD player; 69% 
elected to wait. Those who chose to wait 
then learned that the warranty had expired 
and would not cover repairs; upon receiv- 
ing the news, the majority decided not to 
buy the CD player. Note that this contrasts 
sharply with the unequivocal choice to buy 
the CD player when the $90 repair costs 
were a given. Although they faced the same 
decision, only 55% (including those who 
waited and those who did not) chose to 
buy the CD player in the uncertain con- 
dition, when they did not know but could 
pursue information about the repair costs, 
compared with 91% in the certain condi- 
tion, when repair costs were known from 
the start. The decision to pursue informa- 
tion can focus attention on the information 
obtained and thereby trigger emergent ratio- 
nales for making the choice, ultimately dis- 
torting preference (Bastardi & Shafir, 1998). 
Similar patterns have been replicated in a 
variety of contexts, including one involving 
professional nurses in a renal failure ward, 
more of whom expressed willingness to do- 
nate a kidney (to a hypothetical relative) 
when they had purportedly been tested and 
learned that they were eligible than when 
they had known they were eligible from the 
start (Redelmeier, Shafir, & Aujla, 2001). A
reliance on reasons thus leaves decision
makers susceptible to a variety of contextual
and procedural nuances that render alterna- 
tive potential reasons salient and thus may 
lead to inconsistent choices. 


Processing of Attribute Weights 


Choices can be complex, requiring the eval- 
uation of multiattribute options. Consider, 
for example, a choice between two job 
candidates: One candidate did well in school 
but has relatively unimpressive work ex- 
perience and moderate letters of recom- 
mendation, whereas the other has a poor 
scholastic record but better experience and 
stronger letters. To make this choice, the de- 
cision maker must somehow combine the 
attribute information, which requires deter- 
mining not only the quality or value of each 
attribute, but also the extent to which a 
shortcoming on one attribute can be com- 
pensated for by strength on another. 

Attribute evaluation may be biased by 
a host of factors known to hold sway over 
human judgment (for a review, see Kahne- 
man & Frederick, Chap. 12). Moreover, re- 
searchers have long known that people have 
limited capacity for combining information 
across attributes. Because of unreliable at- 
tribute weights in human judges, simple lin- 
ear models tend to yield normatively better 
predictions than the very judges on whom 
the models are based (Dawes, 1979; Dawes, 
Faust, & Meehl, 1989). In fact, people’s 
unreliable weighting of attributes makes 
them susceptible to a host of manipulations 
that alter attribute weights and yield con- 
flicting preferences (see Shafir & LeBoeuf, 
2004, for a further discussion of multiattri- 
bute choice). 
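
One way to picture a simple linear model is a unit-weight rule: standardize each attribute across candidates and add the results with equal weights. The sketch below is an illustration of that idea only. The candidates, their attribute values, and the choice of equal weights are all assumptions made for the example and are not taken from the studies cited above.

```python
from statistics import mean, pstdev


def unit_weight_scores(candidates):
    """Score each candidate by summing z-scored attributes with equal weights.

    Assumes every attribute is coded so that higher is better and that each
    attribute varies across candidates (nonzero standard deviation).
    """
    names = list(candidates)
    n_attrs = len(next(iter(candidates.values())))
    columns = [[candidates[n][j] for n in names] for j in range(n_attrs)]
    stats = [(mean(col), pstdev(col)) for col in columns]
    return {
        n: sum((candidates[n][j] - m) / s for j, (m, s) in enumerate(stats))
        for n in names
    }


# Hypothetical candidates: (grade average, years of experience, letter rating)
candidates = {
    "A": (3.9, 2, 5),
    "B": (2.8, 6, 8),
    "C": (3.3, 4, 6),
}

scores = unit_weight_scores(candidates)
best = max(scores, key=scores.get)
print(best, {name: round(score, 2) for name, score in scores.items()})
```

The point of such a rule is consistency: the same inputs always yield the same ranking, whereas a human judge's implicit weights may drift from case to case.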


Compatibility 


Options can vary on several dimensions. 
Even simple monetary gambles, for exam- 
ple, differ on payoffs and the chance to win. 
Respondents’ preferences among such gambles
can be assessed in different, but logically
equivalent, ways (see Schkade & Johnson,
1989). For example, participants may be asked to choose among the
gambles or, alternatively, they may estimate 
their maximum willingness to pay for each 
gamble. Notably, these procedures, although 
logically equivalent, often result in differ- 
ential weightings of attributes and, conse- 
quently, in inconsistent preferences. 

Consider two gambles: One offers an 
eight-in-nine chance to win $4 and the other 
a one-in-nine chance to win $40. People 
typically choose the high-probability gamble 
but assign a higher price to the high-payoff 
gamble, thus expressing conflicting prefer- 
ences (Grether & Plott, 1979; Lichtenstein 
& Slovic, 1971, 1973; Tversky, Slovic, & 
Kahneman, 1990). This pattern illustrates 
the principle of compatibility, according to 
which an attribute’s weight is enhanced by 
its compatibility with the response mode 
(Slovic, Griffin, & Tversky, 1990; Tversky, 
Sattath, & Slovic, 1988). In particular, a gam- 
ble’s potential payoff is weighted more heav- 
ily in pricing, where both the price and 
the payoff are in the same monetary units, 
than in choice, where neither attribute maps 
onto the response scale (Schkade & Johnson, 
1989). As a consequence, the high-payoff 
gamble is valued more in pricing relative 
to choice. 
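
For reference, the expected values of the two gambles can be computed directly. The helper name below is hypothetical; the probabilities and payoffs are those given in the text.

```python
from fractions import Fraction


def expected_value(probability, payoff):
    """Expected dollar value of a gamble that pays `payoff` with `probability`."""
    return probability * payoff


high_probability = expected_value(Fraction(8, 9), 4)   # 8-in-9 chance of $4
high_payoff = expected_value(Fraction(1, 9), 40)       # 1-in-9 chance of $40

print(f"high-probability gamble: ${float(high_probability):.2f}")
print(f"high-payoff gamble:      ${float(high_payoff):.2f}")
```

The two expected values are close (about $3.56 and $4.44), so neither choosing the first gamble nor pricing the second higher is unreasonable on its own; the inconsistency lies in doing both.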

For another type of response compati- 
bility, imagine having to choose or, alter- 
natively, having to reject, one of two op- 
tions. Logically speaking, the two tasks are 
interchangeable: If people prefer one option, 
they will reject the second, and vice versa. 
However, people tend to focus on the rel- 
ative strengths of options (more compati- 
ble with choosing) when they choose, and 
on weaknesses (compatible with rejecting) 
when they reject. As a result, options’ posi- 
tive features (the pros) loom larger in choice, 
whereas their negative features (the cons) 
are weighted relatively more during rejec- 
tion. In one study, respondents were pre- 
sented with pairs of options — an enriched 
option, with various positive and negative 
features, and an impoverished option, with 
no real positive or negative features (Shafir, 
1993). For example, consider two vacation 
destinations: one with a variety of positive
and negative features, such as
beaches and great sunshine but cold water
and strong winds, and another that is neutral 
in all respects. Some respondents were asked 
which destination they preferred; others de- 
cided which to forego. Because positive fea- 
tures are weighed more heavily in choice and 
negative features matter relatively more during
rejection, the enriched destination was
most frequently chosen and rejected. Over- 
all, its choice and rejection rates summed to 
115%, significantly more than the impover- 
ished destination’s 85%, and more than the 
100% expected if choice and rejection were 
complementary (see also Downs & Shafir,
1999; Wedell, 1997).
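
A toy weighting model shows how one option can be both chosen and rejected. In the sketch below, every number is hypothetical: the enriched option is given three units of pros and three of cons, the impoverished option none, and the weights (1.0 versus 0.8) merely encode the assumption that pros count somewhat more in choice and cons somewhat more in rejection.

```python
def attractiveness(pros, cons, pro_weight, con_weight):
    """Weighted net evaluation of an option (positive means favorable)."""
    return pro_weight * pros - con_weight * cons


enriched = {"pros": 3, "cons": 3}       # hypothetical feature magnitudes
impoverished = {"pros": 0, "cons": 0}   # neutral in all respects

# Choice task: pros are weighted more heavily than cons (assumed weights).
choose_enriched = (attractiveness(**enriched, pro_weight=1.0, con_weight=0.8)
                   > attractiveness(**impoverished, pro_weight=1.0, con_weight=0.8))

# Rejection task: cons are weighted more heavily than pros (assumed weights).
reject_enriched = (attractiveness(**enriched, pro_weight=0.8, con_weight=1.0)
                   < attractiveness(**impoverished, pro_weight=0.8, con_weight=1.0))

print("enriched chosen:", choose_enriched, "| enriched rejected:", reject_enriched)
```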


Separate Versus Comparative Evaluation 


Decision contexts can facilitate or ham- 
per attribute evaluation, and this can alter 
attribute weights. Not surprisingly, an at- 
tribute whose value is clear can have greater 
impact than an attribute whose value is 
vague. The effects of ease of evaluation, re- 
ferred to as “evaluability,” occur, for exam- 
ple, when an attribute proves difficult to 
gauge in isolation but easier to evaluate in 
a comparative setting (Hsee, 1996; Hsee, 
Loewenstein, Blount, & Bazerman, 1999). In 
one study, subjects were presented with two 
second-hand music dictionaries: one with 
20,000 entries but a slightly torn cover, and 
the other with 10,000 entries and an un- 
blemished cover. Subjects had only a vague 
notion of how many entries to expect in a 
music dictionary; when they saw these one 
at a time, they were willing to pay more for 
the dictionary with the new cover than for 
the one with a cover that was slightly torn. 
When the dictionaries were evaluated con- 
currently, however, the number-of-entries 
attribute became salient: Most subjects ob- 
viously preferred the dictionary with more 
entries, despite the inferior cover. 

For another example, consider a job that 
pays $80,000 a year at a firm where one’s 
peers receive $100,000, compared with a job 
that pays $70,000 while coworkers are paid 
$50,000. Consistent with the fact that most
people prefer higher incomes, a majority of
MBA students who compared
the two options preferred the job with the
higher absolute — despite the lower rela- 
tive — income. When the jobs are contem- 
plated separately, however, the precise mer- 
its of one’s own salary are hard to gauge, 
but earning less than comparable others ren- 
ders the former job relatively less attractive 
than the latter, where one’s salary exceeds 
one’s peers’. Indeed, the majority of MBA 
students who evaluated the two jobs sepa- 
rately anticipated higher satisfaction in the 
job with the lower salary but the higher rela- 
tive position, obviously putting more weight 
on the latter attribute in the context of sep- 
arate evaluation (Bazerman, Schroth, Shah, 
Diekmann, & Tenbrunsel, 1994). 

In the same vein, decision principles that 
are hard to apply in isolated evaluation may 
prove decisive in comparative settings, pro- 
ducing systematic fluctuations in attribute 
weights. Kahneman and Ritov (1994), for 
example, asked participants about their will- 
ingness to contribute to several environmen- 
tal programs. One program was geared to- 
ward saving dolphins in the Mediterranean 
Sea; another funded free medical check- 
ups for farm workers at risk for skin can- 
cer. When asked which program they would 
rather support, the vast majority chose the 
medical checkups for farm workers, pre- 
sumably following the principle that human 
lives come before those of animals. How- 
ever, when asked separately for the largest 
amount they would be willing to pay for 
each intervention, respondents, moved by 
the animals’ vivid plight, were willing to 
pay more for the dolphins than for work- 
ers’ checkups. In a similar application, po- 
tential jurors awarded comparable dollar 
amounts to plaintiffs who had suffered ei- 
ther physical or financial harm, as long as the 
cases were evaluated separately. However, in 
concurrent evaluation, award amounts in- 
creased dramatically when the harm was 
physical as opposed to financial, affirming 
the notion that personal harm is the graver 
offense (Sunstein, Kahneman, Schkade, & 
Ritov, 2001). 

Attribute weights, which are normatively 
assumed to remain stable, systematically 
shift and give rise to patterns of inconsistent
preferences. Discrepancies between
separate versus concurrent evaluation have
profound implications for intuition and for 
policy. Outcomes in life are typically experi- 
enced one at a time: A person lives through 
one scenario or another. Normative intu- 
itions, however, typically arise from concur- 
rent introspection: We entertain a scenario 
along with its alternatives. When an event 
triggers reactions that stem from its being 
experienced in isolation, important aspects 
of the experience will be misconstrued by 
intuitions that arise from concurrent evalu- 
ation (see Shafir, 2002). 


Local Versus Global Perspectives 


Many of the inconsistency patterns de- 
scribed previously would not have arisen 
were decisions considered from a more 
global perspective. The framing of decisions, 
for instance, would be of little consequence 
were people to go beyond the provided 
frame to represent the decision outcomes in 
a canonical manner that is description inde- 
pendent. Instead, people tend to accept the 
decision problem as it is presented, largely 
because they may not have thought of other 
ways to look at the decision, and also be- 
cause they may not expect their preferences 
to be susceptible to presumably incidental 
alterations. (Note that even if they were to 
recognize the existence of multiple perspectives,
people may still not know how to arrive
at a preference independent of a specific
formulation; cf. Kahneman, 2003.) In
this final section, we review several addi- 
tional decision contexts in which a limited 
or myopic approach is seen to guide deci- 
sion making, and inconsistent preferences 
arise as a result of a failure to adopt a more 
“global” perspective. Such a perspective re- 
quires one to ignore momentarily salient fea- 
tures of the decision in favor of other, often 
less salient, considerations that have long- 
run consequences. 


Repeated Decisions 


Decisions that occur on a regular basis are 
often more meaningful when evaluated “in
the long run.” For example, the choice to diet
or to exercise makes little difference on any
one day and can only be carried out under a 
long-term perspective that trumps the per- 
son’s short-term preferences for cake over 
vegetables or for sleeping late rather than go- 
ing to the gym early. People, however, often 
do not take this long-term perspective when 
evaluating instances of a recurring choice; in- 
stead, they tend to treat each choice as an 
isolated event. 

In one study, participants were offered a 
50% chance to win $2000 and a 50% chance 
to lose $500. Although most participants refused
to play this gamble once, the majority
were eager to play the gamble five times,
and, when given the choice, preferred to play 
the gamble six times rather than five. Appar- 
ently, fear of possibly losing the single gam- 
ble is compensated for by the high likelihood 
of ending up ahead in the repeated version. 
Other participants were asked to imagine 
that they had already played the gamble five 
times (outcome as yet unknown) and were 
given the option to play once more. In this 
formulation, a majority of participants re- 
jected the additional play. Although partici- 
pants preferred to play the gamble six times 
rather than five, once they had finished play- 
ing five, the additional opportunity was im- 
mediately “segregated” and treated as a single 
instance, which — as we know from the sin- 
gle gamble version — participants preferred 
to avoid (Redelmeier & Tversky, 1992). 
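
The appeal of the repeated gamble follows from simple binomial arithmetic. The sketch below, which assumes independent plays and uses a hypothetical function name, computes the chance of ending with a net loss, breaking even, or coming out ahead after a given number of plays.

```python
from math import comb


def outcome_probabilities(n_plays, win=2000, loss=500, p_win=0.5):
    """Return (P(net loss), P(break even), P(net gain)) after n independent plays."""
    p_loss = p_even = p_gain = 0.0
    for wins in range(n_plays + 1):
        prob = comb(n_plays, wins) * p_win**wins * (1 - p_win)**(n_plays - wins)
        net = wins * win - (n_plays - wins) * loss
        if net < 0:
            p_loss += prob
        elif net == 0:
            p_even += prob
        else:
            p_gain += prob
    return p_loss, p_even, p_gain


for n in (1, 5, 6):
    p_loss, p_even, p_gain = outcome_probabilities(n)
    print(f"{n} play(s): lose {p_loss:.1%}, break even {p_even:.1%}, gain {p_gain:.1%}")
```

A single play carries a 50% chance of losing money, whereas five plays end in a net loss only if all five are lost (1 in 32, about 3%). An additional play evaluated in isolation is again the single gamble, with its even chance of a loss.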

In a related vein, consider physicians, 
who can think of their patients “individually”
(i.e., patient by patient) or “globally”
(e.g., as groups of patients with similar
problems). In several studies, Redelmeier
and Tversky (1990) found that physicians 
were more likely to take “extra measures,” 
such as ordering an expensive medical test 
or recommending an in-person consultation, 
when they considered the treatment of an 
individual patient than when they consid- 
ered a larger group of similarly afflicted 
patients. Personal concerns loomed larger 
when patients were considered individually 
than when “patients in general” were con- 
sidered, with the latter group more likely to 
highlight efficiency concerns. Because physicians
tend to see patients one at a time, this
predicts a pattern of individual decisions that
is inconsistent with what these physicians
would endorse from a more global perspec- 
tive. For a more mundane example, people 
report greater willingness to wear a seatbelt — 
and to support proseatbelt legislation — when 
they are shown statistics concerning the life- 
time risk of being in a fatal accident instead 
of the dramatically lower risk associated with 
any single auto trip (Slovic et al., 1988). 

Similar patterns prompted Kahneman 
and Lovallo (1993) to argue that decision 
makers often err by treating each decision as 
unique rather than categorizing it as one in 
a series of similar decisions made over a life- 
time (or, in the case of corporations, made by 
many workers). They distinguish an “inside 
view” of situations and plans, characterized 
by a focus on the peculiarities of the case at 
hand, from an “outside view,” guided by an 
analysis of a large number of similar cases. 
Whereas an outside view, based, for exam- 
ple, on base rates, typically leads to a more 
accurate evaluation of the current case, peo- 
ple routinely adopt an inside view, which 
typically overweighs the particulars of the 
given case at the expense of base-rate con- 
siderations. Managers, for example, despite 
knowing that past product launches have 
routinely run over budget and behind sched- 
ule, may convince themselves that this time 
will be different because the team is excel- 
lent or the product exceptional. The inside 
view can generate overconfidence (Kahne- 
man & Lovallo, 1993), as well as undue op- 
timism, for example, regarding the chances 
of completing projects by early deadlines 
(e.g., the planning fallacy; Buehler, Griffin, &
Ross, 1994). The myopia that emerges from 
treating repeated decisions as unique leads 
to overly bold predictions and to the neglect 
of considerations that ought to matter in the 
long run. 


Mental Accounting 


Specific forms of myopia arise in the con- 
text of “mental accounting,” the behav- 
ioral equivalent of accounting done by firms 
wherein people reason about and make 
decisions concerning matters such as income,
spending, and savings. Contrary to
the assumption of fungibility, according to
which money in one account, or from one
source, is a perfect substitute for money in 
another, it turns out that the labeling of ac- 
counts and the nature of transactions have 
a significant impact on people’s decisions 
(Thaler, 1999). For one example, people’s re- 
ported willingness to spend $25 on a theater 
ticket is unaffected by having incurred a $50 
parking ticket but is significantly lowered 
when $50 is spent on a ticket to a sporting 
event (Heath & Soll, 1996). Respondents ap- 
parently bracket expenses into separate ac- 
counts so spending on entertainment is im- 
pacted by a previous entertainment expense 
in a way that it is not if that same expense 
is “allocated” to, say, travel. Along similar 
lines, people who had just lost a $10 bill were 
happy to buy a $10 ticket for a play but were 
less willing to buy the ticket if, instead of the 
money, they had just lost a similar $10 ticket 
(Tversky & Kahneman, 1981). Apparently, 
participants were willing to spend $10 on a 
play even after losing $10 cash but found it 
aversive to spend what was coded as $20 on 
a ticket. 

Finally, consider the following scenario, 
which respondents saw in one of two 
versions: 


Imagine that you are about to purchase 
a jacket for $125 [$15] and a calculator 
for $15 [$125]. The calculator salesman 
informs you that the calculator you want 
to buy is on sale for $10 [$120] at the 
other branch of the store, located 20 min- 
utes drive away. Would you make the trip 
to the other store? (Tversky & Kahneman,
1981, p. 457)


Faced with the opportunity to save $5 on 
a $15 calculator, a majority of respondents 
agreed to make the trip. However, when 
the calculator sold for $125, only a minority
was willing to make the trip for the
same $5 savings. A global evaluation of ei- 
ther version yields a 20-minute voyage for $5 
savings; people, however, seem to make de- 
cisions based on what has been referred to as 
“topical” accounting (Kahneman & Tversky, 
1984), wherein the same $5 saving is coded
as substantial in one case and as
negligible in the other.
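
One way to read topical accounting is that the saving is evaluated relative to the price of the item at hand rather than in absolute terms. The sketch below spells out that arithmetic for the two versions of the calculator problem; the function name and the relative-saving formula are illustrative assumptions, not a model stated in the sources cited.

```python
def relative_saving(price: float, sale_price: float) -> float:
    """Saving expressed as a fraction of the item's regular price."""
    return (price - sale_price) / price


# Prices taken from the two versions of the scenario.
cheap = relative_saving(15, 10)      # $15 calculator on sale for $10
costly = relative_saving(125, 120)   # $125 calculator on sale for $120
absolute = (15 - 10, 125 - 120)      # the global evaluation: same $5 either way

print(f"absolute savings: {absolute}")
print(f"relative saving on the $15 calculator:  {cheap:.0%}")
print(f"relative saving on the $125 calculator: {costly:.0%}")
```

The absolute saving is identical in both versions; only the proportion of the calculator's price differs (about a third versus 4%), which is consistent with the majority and minority responses described above.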

Specific formulations and contextual de- 
tails are not spontaneously reformulated 
or translated into more comprehensive or 
canonical representations. As a consequence, 
preferences prove highly labile and depen- 
dent on what are often theoretically, as well 
as practically, unimportant and accidental 
details. An extensive literature on mental ac- 
counting, as well as behavioral finance, forms 
part of the growing field of behavioral eco- 
nomics (see, e.g., Camerer, Loewenstein, & 
Rabin, 2004; Thaler, 1993, 1999).


Temporal Discounting 


A nontrivial task is to decide how much 
weight to give to outcomes extended into 
the distant future. Various forms of uncer- 
tainty (regarding nature, one’s own tastes, 
and so on) justify some degree of discount- 
ing in calculating the present value of future 
goods. Thus, $1000 received next year is typ- 
ically worth less than $1000 received today. 
As it turns out, observed discount rates tend 
to be unstable and often influenced by fac- 
tors, such as the size of the good and its tem- 
poral distance, that are not subsumed un- 
der standard normative analyses (see Ainslie, 
2001; Frederick, Loewenstein, & O’Donoghue,
2002; Loewenstein & Thaler, 1989, for re- 
view). For example, although some people 
prefer an apple today over two apples to- 
morrow, virtually nobody prefers one apple 
in 30 days over two apples in 31 days (Thaler, 
1981). Because discount functions are non- 
exponential (see also Loewenstein & Prelec, 
1992), a 1-day delay has greater impact when
that day is near than when it is far. Simi- 
larly, when asked what amount of money in 
the future would be comparable to receiv- 
ing a specified amount today, people require 
about $60 in 1 year to match $15 now, but 
they are satisfied with $4000 in a year in- 
stead of $3000 today. This implies discount 
rates of 300% in the first case and of 33% in 
the second. To the extent that one engages 
in a variety of transactions throughout time, 
imposing wildly disparate discount rates 
on smaller versus larger amounts ignores
the fact that small amounts will
eventually add up to be larger, yielding
systematic inconsistency. 
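
The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below first recomputes the implied annual discount rates and then contrasts a hyperbolic with an exponential discount function on the apple example. The hyperbolic form A / (1 + kD), the parameter k = 1.5, and the daily exponential factor 0.4 are assumptions chosen only for illustration; the sources cited do not commit to these values.

```python
def implied_annual_rate(now: float, later: float) -> float:
    """Discount rate implied by indifference between `now` and `later` in 1 year."""
    return (later - now) / now


small = implied_annual_rate(15, 60)       # $60 in a year matches $15 now
large = implied_annual_rate(3000, 4000)   # $4000 in a year matches $3000 now
print(f"small amount: {small:.0%}, large amount: {large:.0%}")


def hyperbolic_value(amount: float, delay_days: float, k: float = 1.5) -> float:
    """Present value under an assumed hyperbolic discount function."""
    return amount / (1 + k * delay_days)


def exponential_value(amount: float, delay_days: float, daily_factor: float = 0.4) -> float:
    """Present value under an assumed exponential discount function."""
    return amount * daily_factor ** delay_days


for name, value in (("hyperbolic", hyperbolic_value), ("exponential", exponential_value)):
    near = "1 apple" if value(1, 0) > value(2, 1) else "2 apples"
    far = "1 apple" if value(1, 30) > value(2, 31) else "2 apples"
    print(f"{name}: today vs. tomorrow -> {near}; day 30 vs. day 31 -> {far}")
```

With these assumed parameters, the exponential discounter who takes one apple today also takes one apple on day 30, whereas the hyperbolic discounter switches to two apples once both options are distant. That switch is the preference reversal the text attributes to nonexponential discounting.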

Excessive discounting turns into myopia, 
which is often observed in people’s atti- 
tudes toward future outcomes (see, e.g., 
Elster, 1984; Elster & Loewenstein, 1992). 
Loewenstein and Thaler (1989) discussed a 
West Virginia experiment in which the high 
school dropout rate was reduced by one- 
third when dropouts were threatened with 
the loss of their driving privileges. This im- 
mediate consequence apparently had a sig- 
nificantly greater impact than the far more 
serious but more distant socioeconomic im- 
plications of failing to graduate from high 
school. These authors also mention physi- 
cians’ typical lament that warning about the 
risk of skin cancer from excessive sun ex- 
posure has less effect than the warning that 
such exposure can cause large pores and 
acne. In fact, “quit smoking” campaigns have 
begun to stress the immediate benefits of 
quitting (quick reduction in the chance of a 
heart attack, improved ability to taste foods 
within 2 days, and such) even more promi- 
nently than the long-term benefits (Ameri- 
can Lung Association, 2003). Similar reason- 
ing applies in the context of promoting safe 
sex practices and medical self-examinations, 
where immediate gratification or discom- 
fort often trumps much greater, but tempo- 
rally distant, considerations. Schelling (1980, 
1984) thought about similar issues of self- 
control in the face of immediate temptation 
as involving multiple “selves”; it is to related 
considerations of alternate frames of mind 
that we turn next. 


Frames of Mind 


Myopic decisions can occur when highly 
transient frames of mind are momentar- 
ily triggered, highlighting values and desires 
that may not reflect the decision maker’s 
more global preferences. Because choices of- 
ten involve delayed consumption, failure to 
anticipate the labile nature of preferences 
may lead to the selection of later-disliked 
alternatives. 


At the most basic level, transient mindsets 
arise when specific criteria are made mo- 
mentarily salient. Grocery shopping while 
very hungry, for example, is likely to lead to 
purchases that would not have been made 
under normal circumstances (cf. Loewen- 
stein, 1996). In a study of the susceptibil- 
ity to temporary criterion salience, partici- 
pants first received a “word perception test” 
in which either creativity, reliability, or a 
neutral topic was primed. Participants then 
completed an ostensibly unrelated “prod- 
uct impression task” that gauged their opin- 
ions of various cameras. Cameras advertised 
for their creative potential were rated as 
more attractive by those primed for creativ- 
ity than by those exposed to words related 
to reliability or a neutral topic (Bettman & 
Sujan, 1987). Momentary priming thus im- 
pacted ensuing preferences, rendering more 
salient criteria that had not previously been 
considered important, despite the fact that 
product consumption was likely to occur 
long after such momentary criterion salience 
dissipated (see Mandel & Johnson, 2002; 
Verplanken & Holland, 2002; Wright & 
Heath, 2000). 


IDENTITIES 


At a broader level, preferences fluctuate 
along with momentarily salient identities. A 
working woman, for example, might think 
of herself primarily as a mother when in the 
company of her children but may see herself
primarily as a professional while at work.
The list of potential identities can be exten- 
sive (Turner, 1985) with some of a person’s 
identities (e.g., “mother”) conjuring up strik- 
ingly different values and ideals from oth- 
ers (e.g., “CEO”). Although choices are typ- 
ically expected to reveal stable and coherent 
preferences that correspond to the wishes 
of the self as a whole, in fact, choice often 
fluctuates in accord with happenstance fluc- 
tuations in identity salience. In one study, 
college students whose “academic” identi- 
ties had been triggered were more likely to 
opt for more academic periodicals (e.g., The 
Economist) than were those whose “socialite”
identities had been triggered. Similarly,
Chinese Americans whose American identities
were evoked adopted more stereotyp-
ically American preferences (e.g., for indi- 
viduality and competition over collectivism 
and cooperation) compared with when 
their Chinese identities had been triggered 
(LeBoeuf, 2002; LeBoeuf & Shafir, 2004). 
Preference tends to align with currently 
salient identities, yielding systematic tension 
anytime there is a mismatch between the 
identity that does the choosing and the one 
likely to do the consuming, as when a par- 
ent commits to a late work meeting only to 
regret missing her child’s soccer game once 
back at home. 


EMOTIONS AND DRIVES 


Emotions can have similar effects, influenc- 
ing the momentary evaluation of outcomes, 
and thus choice. The anticipated pain of a 
loss is apparently greater for people in a pos- 
itive mood than for those in a negative mood; 
this leads to greater risk aversion among 
those in a good mood as they strive for 
“mood maintenance” (e.g., Isen, Nygren, & 
Ashby, 1988). Furthermore, risk judgments 
tend to be more pessimistic among people in 
a negative than a positive mood (e.g., John- 
son & Tversky, 1983). However, valence is 
not the sole determinant of an emotion’s 
influence: Anger, a negative emotion, seems 
to increase appraisals of individual control, 
leading to optimistic risk assessment and to 
risk seeking, whereas fear, also a negative 
emotion, is not associated with appraisals of 
control and promotes risk aversion (Lerner 
& Keltner, 2001). 

Emotions, or affect, also influence the 
associations or images that come to mind 
in decision making. Because images can be 
consulted quickly and effortlessly, an “affect 
heuristic” has been proposed with affective 
assessments sometimes guiding decisions 
(Slovic, Finucane, Peters, & MacGregor, 
2002). Furthermore, “anticipatory emo- 
tions” (e.g., emotional reactions to being in 
a risky situation) can influence the cogni- 
tive appraisal of decision situations and can 
affect choice (Loewenstein, Weber, Hsee,
& Welch, 2001). Motivations
can influence reasoning more generally
(see Molden & Higgins, Chap. 13).
Emotion and affect thus influence people’s 
preferences; however, because these senti- 
ments are often transient, such influence 
contributes to reversals of preference as mo- 
mentary emotions and drives fluctuate. 

Inconsistency thus often arises because 
people do not realize that their preferences 
are being momentarily altered by situation- 
ally induced sentiments. Evidence suggests, 
however, that even when people are aware of 
being in the grip of a transient drive or emo- 
tion, they may not be able to “correct” ade- 
quately for that influence. For example, re- 
spondents in one study were asked to predict 
whether they would be more bothered by 
thirst or by hunger if trapped in the wilder- 
ness without water or food. Some answered 
right before exercising (when not especially 
thirsty), whereas others answered immedi- 
ately after exercising (thus, thirsty). Postex- 
ercise, 92% indicated that they would be 
more troubled by thirst than by hunger in 
the wilderness, compared with 61% preexer- 
cise (Van Boven & Loewenstein, 2003). Post- 
exercise, people could easily attribute their 
thirst to the exercise. Nonetheless, when 
imagining how they would feel in another, 
quite different and distant situation, peo- 
ple projected their current thirst. More 
generally, people tend to exhibit “empathy 
gaps,” wherein they underestimate the de- 
gree to which various contextual changes 
will impact their drives, emotions, and 
preferences (e.g., Van Boven, Dunning, & 
Loewenstein, 2000; see also Gilbert, Pinel, 
Wilson, Blumberg, & Wheatley, 1998). This 
can further contribute to myopic decision 
making, for people honor present feelings 
and inclinations, not fully appreciating the
extent to which these may be attributable 
to fairly incidental factors that thus may 
soon dissipate. 


Conclusions and Future Directions 


A review of the behavioral decision-making 
literature shows people’s preferences to be
influenced by a host of factors not subsumed under
the compelling and popular normative the- 
ory of choice. People’s preferences are heav- 
ily shaped, among other things, by par- 
ticular perceptions of risk and value, by 
multiple influences on attribute weights, 
by the tendency to avoid decisional con- 
flict and to rely on compelling reasons for 
choice, by salient identities and emotions, 
and by a general tendency to accept deci- 
sion situations as they are described, rarely 
reframing them in alternative, let alone 
canonical, ways. 

It is tempting to attribute many of the 
effects to shallow processing or to a fail- 
ure to consider the decision seriously (see, 
e.g., Grether & Plott, 1979; Smith, 1985; 
see also Shafir & LeBoeuf, 2002, for fur- 
ther review of critiques of the findings). 
After all, it seems plausible that partici- 
pants who consider a problem more care- 
fully might notice that it can be framed in 
alternate ways. This would allow a consideration
of the problem from multiple perspectives
and perhaps lead to a response
unbiased by problem frame or other “incon- 
sequential” factors (cf. Sieck & Yates, 1997). 
Evidence suggests, however, that the pat- 
terns documented previously cannot be at- 
tributed to laziness, inexperience, or lack 
of motivation. The same general effects are 
observed when participants are provided 
greater incentives (Grether & Plott, 1979; 
see Camerer & Hogarth, 1999, for a review), 
when they are asked to justify their choices 
(Fagley & Miller, 1987; LeBoeuf & Shafir, 
2003; Levin & Chapman, 1990), when they 
are experienced or expert decision makers 
(Camerer, Babcock, Loewenstein, & Thaler, 
1997; McNeil, Pauker, Sox, & Tversky, 1982; 
Redelmeier & Shafir, 1995; Redelmeier, 
Shafir, & Aujla, 2001), or when they are the 
types (e.g., “high need for cognition”) who 
naturally think more deeply about prob- 
lems (LeBoeuf & Shafir, 2003; Levin, Gaeth, 
Schreiber, & Lauriola, 2002). These findings 
suggest that many of the attitudes triggered 
by specific choice problem frames are at least 
somewhat entrenched, with extra thought 
or effort only serving to render the dominant 


highlighting the need for debiasing (Arkes, 
1991; LeBoeuf & Shafir, 2003; Thaler, 1991). 
Research in decision making is active and 
growing. Among interesting current devel- 
opments, several researchers have argued for 
a greater focus on emotion as a force guid- 
ing decisions (Hsee & Kunreuther, 2000; 
Loewenstein et al., 2001; Rottenstreich & 
Hsee, 2001; Slovic et al., 2002). Others 
are investigating systematic dissociations be- 
tween experienced utility, that is, the he- 
donic experience an option actually brings, 
and decision utility, the utility implied by
the decision. Such investigations correctly 
point out that, in addition to exhibiting 
consistent preferences, one would also want 
decision makers to choose those options 
that will maximize the quality of experi- 
ence (Kahneman, 1994). As it turns out, 
misprediction of experienced utility is com- 
mon, in part because people misremember 
the hedonic qualities of past events (Kahne- 
man, Fredrickson, Schreiber, & Redelmeier, 
1993), and in part because they fail to antic- 
ipate how enjoyment may be impacted by 
factors such as mere exposure (Kahneman 
& Snell, 1992), the dissipation of satiation 
(Simonson, 1990), and the power of adapta- 
tion, even to dramatic life changes (Gilbert 
et al., 1998; Schkade & Kahneman, 1998). 
An accurate description of human de- 
cision making needs to incorporate those 
and other tendencies not reviewed in this 
chapter, including a variety of other judg- 
mental biases (see Kahneman & Frederick, 
Chap. 12), as well as people’s sensitivity to 
considerations such as fairness (Kahneman, 
Knetsch, & Thaler, 1986a, 1986b; Rabin, 
1993) and sunk costs (Arkes & Blumer, 
1985; Gourville & Soman, 1998). A suc- 
cessful descriptive model must allow for 
violations of normative criteria, such as 
procedure and description invariance, dom- 
inance, regularity, and, occasionally, transi- 
tivity. It must also allow for the eventual 
incorporation of other psychological pro- 
cesses that might impact choice. For ex- 
ample, it has been suggested that taking 
aspiration levels into account may sometimes
predict risky decision making better
than does a reliance
on reference points (Lopes & Oden, 1999).
The refinement of descriptive theories is 
an evolving process; however, the product 
that emerges continuously seems quite dis- 
tant from the elegant and optimal normative 
treatment. At the same time, acknowledged 
departures from the normative theory need 
not weaken that theory’s normative force. 
After all, normative theories are themselves 
empirical projects, capturing what people 
consider ideal: As we improve our under- 
standing of how decisions are made, we 
may be able to formulate prescriptive pro- 
cedures to guide decision makers, in light of 
their limitations, to better capture their normative
wishes.

Of course, there are instances in which 
people have very clear preferences that no 
amount of subtle manipulation will alter (cf.
Payne, Bettman, & Johnson, 1992). At other 
times, we appear to be at the mercy of fac- 
tors that we would often like to consider 
inconsequential. This conclusion, well ac- 
cepted within psychology, is becoming in- 
creasingly influential not only in decision re- 
search, but also in the social sciences more 
generally, with prominent researchers in law, 
medicine, sociology, and economics exhort- 
ing their fields to pay attention to findings of 
the sort reviewed here in formulating new 
ways of thinking about and predicting be- 
havior. Given the academic, personal, and 
practical import of decision making, such de- 
velopments may prove vital to our under- 
standing of why people think, act, and de- 
cide as they do. 

CHAPTER 12


A Model of Heuristic Judgment 


Daniel Kahneman 
Shane Frederick 


The program of research now known as the 
heuristics and biases approach began with a 
study of the statistical intuitions of experts, 
who were found to be excessively confi- 
dent in the replicability of results from small 
samples (Tversky & Kahneman, 1971). The 
persistence of such systematic errors in the 
intuitions of experts implied that their intu- 
itive judgments may be governed by funda- 
mentally different processes than the slower, 
more deliberate computations they had been 
trained to execute. 

From its earliest days, the heuristics and 
biases program was guided by the idea that 
intuitive judgments occupy a position — per- 
haps corresponding to evolutionary history — 
between the automatic parallel operations 
of perception and the controlled serial op- 
erations of reasoning. Intuitive judgments 
were viewed as an extension of percep- 
tion to judgment objects that are not cur- 
rently present, including mental represen- 
tations that are evoked by language. The 
mental representations on which intuitive 
judgments operate are similar to percepts. 
Indeed, the distinction between perception 
and judgment is often blurry: The perception
of a stranger as menacing entails a prediction
of future harm. 

The ancient idea that cognitive processes 
can be partitioned into two main families — 
traditionally called intuition and reason — 
is now widely embraced under the general 
label of dual-process theories (Chaiken & 
Trope, 1999; Evans & Over, 1996; Hammond,
1996; Sloman, 1996, 2002; see Evans,
Chap. 8). Dual-process models come in 
many flavors, but all distinguish cognitive 
operations that are quick and associative 
from others that are slow and governed by 
rules (Gilbert, 1999). 

To represent intuitive and deliberate rea- 
soning, we borrow the terms “system 1” and 
“system 2” from Stanovich and West (2002). 
Although these labels may suggest two autonomous
homunculi, no such meaning is intended.
We use the term “system” only as a label for 
collections of cognitive processes that can 
be distinguished by their speed, their con- 
trollability, and the contents on which they 
operate. In the particular dual-process model 
we assume, system 1 quickly proposes intu- 
itive answers to judgment problems as they 
arise, and system 2 monitors the quality of
these proposals, which it may endorse, correct,
or override. The judgments that are
eventually expressed are called intuitive if 
they retain the hypothesized initial proposal 
with little modification. 

The effect of concurrent cognitive tasks 
provides the most useful indication of 
whether a given mental process belongs to 
system 1 or system 2. Because the over- 
all capacity for mental effort is limited, ef- 
fortful processes tend to disrupt each other, 
whereas effortless processes neither cause 
nor suffer much interference when com- 
bined with other tasks (Kahneman, 1973; 
Pashler, 1998). It is by this criterion that we 
assign the monitoring function to system 2: 
People who are occupied by a demanding 
mental activity (e.g., attempting to hold in 
mind several digits) are much more likely 
to respond to another task by blurting out 
whatever comes to mind (Gilbert, 1989). By 
the same criterion, the acquisition of highly 
skilled performances — whether perceptual 
or motor — involves the transformation of an 
activity from effortful (system 2) to effort- 
less (system 1). The proverbial chess master 
who strolls past a game and quips, “White 
mates in three” is performing intuitively 
(Simon & Chase, 1973). 

Our views about the two systems are 
similar to the “correction model” proposed 
by Gilbert (1989, 1991) and to other dual- 
process models (Epstein, 1994; Hammond, 
1996; Sloman, 1996; see also Shweder, 
1977). We assume system 1 and system 2 
can be active concurrently, that automatic 
and controlled cognitive operations compete 
for the control of overt responses, and that 
deliberate judgments are likely to remain 
anchored on initial impressions. We also 
assume that the contribution of the two 
systems in determining stated judgments 
depends on both task features and individ- 
ual characteristics, including the time avail- 
able for deliberation (Finucane et al., 2000), 
mood (Bless et al., 1996; Isen, Nygren, & 
Ashby, 1988), intelligence (Stanovich & 
West, 2002), cognitive impulsiveness (Fred- 
erick, 2004), and exposure to statistical 
thinking (Agnoli, 1991; Agnoli & Krantz, 
1989; Nisbett et al., 1983). 


In the context of a dual-system view,
errors of intuitive judgment raise two
questions: “What features of system 1 cre- 
ated the error?” and “Why was the error not 
detected and corrected by system 2?” (cf. 
Kahneman & Tversky, 1982). The first ques- 
tion is more basic, of course, but the second 
is also relevant and ought not be overlooked. 
Consider, for example, the paragraph that 
Tversky and Kahneman (1974; p. 3 in 
Kahneman, Slovic, & Tversky, 1982) used to 
introduce the notions of heuristic and bias:


The subjective assessment of probability re- 
sembles the subjective assessment of physi- 
cal quantities such as distance or size. These 
judgments are all based on data of lim- 
ited validity, which are processed accord- 
ing to heuristic rules. For example, the ap- 
parent distance of an object is determined 
in part by its clarity. The more sharply 
the object is seen, the closer it appears to 
be. This rule has some validity, because in 
any given scene the more distant objects 
are seen less sharply than nearer objects. 
However, the reliance on this rule leads to 
systematic errors in the estimation of dis- 
tance. Specifically, distances are often over- 
estimated when visibility is poor because 
the contours of objects are blurred. On the 
other hand, distances are often underesti- 
mated when visibility is good because the 
objects are seen sharply. Thus the reliance 
on clarity as an indication leads to com- 
mon biases. Such biases are also found in 
intuitive judgments of probability. 


This statement was intended to extend 
Brunswik’s (1943) analysis of the percep- 
tion of distance to the domain of intuitive 
thinking and to provide a rationale for us- 
ing biases to diagnose heuristics. However, 
the analysis of the effect of haze is flawed: 
It neglects the fact that an observer looking 
at a distant mountain possesses two relevant 
cues, not one. The first cue is the blur of the 
contours of the target mountain, which is 
positively correlated with its distance, when 
all else is equal. This cue should be given 
positive weight in a judgment of distance, 
and it is. The second relevant cue, which 
the observer can readily assess by looking 
around, is the ambient or general haziness. 
In an optimal regression model for estimating
distance, general haziness is a suppressor
variable, which must be weighted negatively
because it contributes to blur but is uncorrelated
with distance. Contrary to the argu-
ment made in 1974, using blur as a cue does 
not inevitably lead to bias in the judgment 
of distance — the illusion could just as well 
be described as a failure to assign adequate 
negative weight to ambient haze. The effect 
of haziness on impressions of distance is a 
failing of system 1: The perceptual system is 
not designed to correct for this variable. The 
effect of haziness on judgments of distance 
is a separate failure of system 2. Although 
people are capable of consciously correcting 
their impressions of distance for the effects 
of ambient haze, they commonly fail to do 
so. A similar analysis applies to some of the 
judgmental biases we discuss later, in which 
errors and biases only occur when both sys- 
tems fail. 
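
The suppressor-variable argument can be illustrated with a small regression. The sketch below is a toy construction, not the authors' analysis: it assumes noise-free scenes in which blur is simply the sum of distance and ambient haze, and it pairs every distance with every haze level so that haze is uncorrelated with distance by design. The helper function and all data values are hypothetical.

```python
from itertools import product


def ols_two_predictors(x1, x2, y):
    """Least-squares slopes for y ~ x1 + x2 (intercept absorbed by centering)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * t for a, t in zip(c1, cy))
    s2y = sum(b * t for b, t in zip(c2, cy))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


# Hypothetical scenes: every distance crossed with every haze level.
scenes = list(product([1.0, 2.0, 3.0, 4.0], [0.0, 1.0]))
distance = [d for d, _ in scenes]
haze = [h for _, h in scenes]
blur = [d + h for d, h in scenes]  # assumed: blur is additive in distance and haze

w_blur, w_haze = ols_two_predictors(blur, haze, distance)
print(f"weight on blur: {w_blur:+.2f}, weight on haze: {w_haze:+.2f}")
```

In this toy setup the best linear estimate of distance gives blur a weight of +1 and haze a weight of -1, even though haze alone carries no information about distance. Ignoring haze, as the perceptual system is said to do, amounts to setting that negative weight to zero.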

In the following section, we present 
an attribute-substitution model of heuris- 
tic judgment, which assumes that difficult 
questions are often answered by substi- 
tuting an answer to an easier one. This 
elaborates and extends earlier treatments 
of the topic (Kahneman & Tversky, 1982; 
Tversky & Kahneman, 1974, 1983). Fol- 
lowing sections introduce a research design 
for studying attribute substitution, as well 
as discuss the controversy over the repre- 
sentativeness heuristic in the context of a 
dual-system view that we endorse. The final 
section situates representativeness within 
a broad family of prototype heuristics, in 
which properties of a prototypical exemplar 
dominate global judgments concerning an 
entire set. 


Attribute Substitution 


The early research on judgment heuris- 
tics was guided by a simple and general 
hypothesis: When confronted with a difficult
question, people may answer an eas-
ier one instead and are often unaware of 
the substitution. A person who is asked 
“What proportion of long-distance relationships
break up within a year?” may answer
as if she had been asked “Do instances of
failed long-distance relationships come read- 
ily to mind?” This would be an applica- 
tion of the availability heuristic. A profes- 
sor who has heard a candidate’s job talk and 
now considers the question “How likely is it 
that this candidate could be tenured in our 
department?” may answer the much easier 
question: “How impressive was the talk?”
This would be an example of one form of 
the representativeness heuristic. 

The heuristics and biases research pro- 
gram has focused primarily on representa- 
tiveness and availability — two versatile at- 
tributes that are automatically computed 
and can serve as candidate answers to many 
different questions. It has also focused prin- 
cipally on thinking under uncertainty. How- 
ever, the restriction to particular heuristics 
and to a specific context is largely arbitrary. 
Kahneman and Frederick (2002) argued that 
this process of attribute substitution is a 
general feature of heuristic judgment; that 
whenever the aspect of the judgmental ob- 
ject that one intends to judge (the target at- 
tribute) is less readily assessed than a related 
property that yields a plausible answer (the 
heuristic attribute), individuals may unwit- 
tingly substitute the simpler assessment. For 
an example, consider the well-known study 
by Strack, Martin, and Schwarz (1988) in 
which college students answered a survey 
that included these two questions: “How 
happy are you with your life in general?” and 
“How many dates did you have last month?” 
The correlation between the two questions 
was negligible when they occurred in the 
order shown, but rose to .66 if the dating 
question was asked first. We suggest that the 
question about dating frequency automati- 
cally evokes an evaluation of one’s romantic 
satisfaction and that this evaluation lingers 
to become the heuristic attribute when the 
global happiness question is subsequently 
encountered. 

To further illustrate the process of at- 
tribute substitution, consider a question in 
a study by Frederick and Nelson (2004): 
“If a sphere were dropped into an open
cube, such that it just fit (the diameter
of the sphere is the same as the interior
width of the cube), what proportion of
the volume of the cube would the sphere 
occupy?” The target attribute in this judg- 
ment (the volumetric relation between a 
cube and sphere) is simple enough to be un- 
derstood but complicated enough to accom- 
modate a wide range of estimates as plau- 
sible answers. Thus, if a relevant simpler 
computation or perceptual impression ex- 
ists, respondents will have no strong basis for 
rejecting it as their “final answer.” Frederick 
and Nelson (2004) proposed that the areal 
ratio of the respective cross-sections serves 
that function; that is, that respondents an- 
swer as if they were asked the simpler two- 
dimensional analog of this problem (“If a 
circle were drawn inside a square, what pro- 
portion of the area of the square does the 
circle occupy?”). As evidence, they noted 
that the mean estimate of the “sphere inside 
cube” problem (74%) is scarcely different 
from the mean estimate of the “circle inside 
square” problem (77%) and greatly exceeds 
the correct answer (52%) — a correct an- 
swer that most people, not surprisingly, are 
surprised by. 
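
The two normative proportions follow from elementary geometry, and a short Python sketch makes the gap between them concrete. The function names are illustrative assumptions and are not part of Frederick and Nelson's materials; the code computes only the correct ratios and says nothing about how respondents reason.

```python
import math


def sphere_in_cube_fraction() -> float:
    """Fraction of a cube's volume filled by a sphere whose diameter equals the cube's width."""
    # (4/3) * pi * (d/2)**3 divided by d**3 simplifies to pi / 6, independent of d.
    return math.pi / 6


def circle_in_square_fraction() -> float:
    """Fraction of a square's area covered by a circle whose diameter equals the square's width."""
    # pi * (d/2)**2 divided by d**2 simplifies to pi / 4, independent of d.
    return math.pi / 4


print(f"sphere in cube:   {sphere_in_cube_fraction():.1%}")    # about 52.4%
print(f"circle in square: {circle_in_square_fraction():.1%}")  # about 78.5%
```

The two-dimensional ratio (about 79%) lies close to the mean estimates reported for both problems, whereas the three-dimensional ratio (about 52%) does not.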


Biases 


Whenever the heuristic attribute differs 
from the target attribute, the substitution 
of one for the other inevitably introduces 
systematic biases. In this treatment, we 
are mostly concerned with weighting bi- 
ases, which arise when cues available to 
the judge are given either too much or 
too little weight. Criteria for determining 
optimal weights can be drawn from sev- 
eral sources. In the classic lens model, the 
optimal weights associated with different 
cues are the regression weights that opti- 
mize the prediction of an external criterion, 
such as physical distance or the grade point 
average that a college applicant will attain 
(Brunswik, 1943; Hammond, 1955). Our 
analysis of weighting biases applies to such 
cases, but it also extends to attributes for 
which no objective criterion is available, 
such as an individual’s overall happiness 
or the probability that a particular patient 
will survive surgery. Normative standards for
[these attributes must be drawn from the]
constraints of ordinary language and are often
imprecise. For example, the conventional in- 
terpretation of overall happiness does not 
specify how much weight ought to be given 
to various life domains. However, it certainly 
does require that substantial weight be given 
to every important domain of life and that 
no weight at all be given to the current 
weather or to the recent consumption of a 
cookie. Similar rules of common sense ap- 
ply to judgments of probability. For example, 
the statement “John is more likely to survive 
a week than a month” is clearly true, and, 
thus, implies a rule that people would want 
their probability judgments to follow. Ac- 
cordingly, neglect of duration in assessments 
of survival probabilities would be properly 
described as a weighting bias, even if there 
were no way to establish a normative prob- 
ability for individual cases (Kahneman & 
Tversky, 1996). 

For some judgmental tasks, information 
that could serve to supplement or correct the 
heuristic is not neglected or underweighted 
but simply lacking. If asked to judge the rela- 
tive frequency of words beginning with K or 
R (Tversky & Kahneman, 1973) or to com- 
pare the population of a familiar foreign city 
with one that is unfamiliar (Gigerenzer & 
Goldstein, 1996), respondents have little re- 
course but to base their judgments on ease 
of retrieval or recognition. The necessary re- 
liance on these heuristic attributes renders 
such judgments susceptible to biasing factors 
(e.g., the amount of media coverage). How- 
ever, unlike weighting biases, such biases of 
insufficient information cannot be described 
as errors of judgment because there is no way 
to avoid them. 


Accessibility and Substitution 


The intent to judge a target attribute initi- 
ates a search for a reasonable value. Some- 
times this search ends quickly because the 
required value can be read from a stored 
memory (e.g., the answer to the question 
“How tall are you?”) or a current experience 
(e.g., the answer to the question “How much 
do you like this cake?”). For other judgments,
however, the target attribute does [not come
readily to mind, and the search] for it evokes
other attributes that are conceptually and
associatively related. For example, a question
about overall happiness
may retrieve the answer to a related ques- 
tion about satisfaction with a particular as- 
pect of life upon which one is currently 
reflecting. 

We adopt the term accessibility to refer 
to the ease (or effort) with which particu- 
lar mental contents come to mind (see, e.g., 
Higgins, 1996; Tulving & Pearlstone, 1966). 
The question of why thoughts become ac- 
cessible — why particular ideas come to mind 
at particular times — has a long history in psy- 
chology and encompasses notions of stimu- 
lus salience, associative activation, selective 
attention, specific training, and priming. In 
the present usage, accessibility is determined 
jointly by the characteristics of the cogni- 
tive mechanisms that produce it and by the 
characteristics of the stimuli and events that 
evoke it, and it may refer to different aspects 
and elements of a situation, different ob- 
jects in a scene, or different attributes of an 
object. 

Attribute substitution occurs when a rela- 
tively inaccessible target attribute is assessed 
by mapping a relatively accessible and re- 
lated heuristic attribute onto the target scale. 
Some attributes are permanent candidates 
for the heuristic role because they are rou- 
tinely evaluated as part of perception and 
comprehension and therefore always acces- 
sible (Tversky & Kahneman, 1983). These 
natural assessments include physical prop- 
erties such as size and distance and more 
abstract properties such as similarity (e.g., 
Tversky & Kahneman, 1983; see Goldstone 
& Son, Chap. 2), cognitive fluency in per- 
ception and memory (e.g., Jacoby & Dallas, 
1991; Schwarz & Vaughn, 2002; Tversky & 
Kahneman, 1973), causal propensity (Hei- 
der, 1944; Kahneman & Varey, 1990; Mi- 
chotte, 1963), surprisingness (Kahneman & 
Miller, 1986), mood (Schwarz & Clore, 
1983), and affective valence (e.g., Bargh,
1997; Cacioppo, Priester, & Berntson, 1993; 
Kahneman, Ritov, & Schkade, 1999; Slovic 
et al., 2002; Zajonc, 1980, 1997). 

Because affective valence is a natural
assessment, it is a candidate for attribute
substitution [...] judgments. Indeed, the
evidence suggests
that a list of major general-purpose heuris- 
tics should include an affect heuristic (Slovic 
et al., 2002). Slovic and colleagues (2002) 
show that a basic affective reaction gov- 
erns a wide variety of more complex evalua- 
tions such as the cost-benefit ratio of various 
technologies, the safe level of chemicals, or 
even the predicted economic performance 
of various industries. In the same vein, Kah- 
neman and Ritov (1994) and Kahneman, 
Ritov, and Schkade (1999) proposed that an 
automatic affective valuation is the principal 
determinant of willingness to pay for public 
goods, and Kahneman, Schkade, and Sun- 
stein (1998) interpreted jurors’ assessments 
of punitive awards as a mapping of outrage 
onto a dollar scale of punishments. 

Attributes that are not naturally assessed 
can become accessible if they have been re- 
cently evoked or primed (see, e.g., Bargh et 
al., 1986; Higgins & Brendl, 1995). The ef- 
fect of temporary accessibility is illustrated 
by the “romantic satisfaction heuristic” for 
judging happiness. The mechanism of at- 
tribute substitution is the same, however, 
whether the heuristic attribute is chronically 
or temporarily accessible. 

There is sometimes more than one can- 
didate for the role of heuristic attribute. For 
an example that we borrow from Anderson 
(1991), consider the question “Are more 
deaths caused by rattlesnakes or bees?” A re- 
spondent who has recently read about some- 
one who died from a snakebite or bee sting 
may use the relative availability of instances 
of the two categories as a heuristic. If no 
instances come to mind, that person might 
consult his or her impressions of the “dan- 
gerousness” of the typical snake or bee, an 
application of representativeness. Indeed, it 
is possible that the question initiates both 
a search for instances and an assessment of 
dangerousness, and that a contest of accessi- 
bility determines the role of the two heuris- 
tics in the final response. As Anderson ob- 
served, it is not always possible to determine 
a priori which heuristic will govern the re- 
sponse to a particular problem. 

The original list of heuristics (Tversky &
Kahneman, 1974) also included an [anchoring
heuristic. Anchoring,] however, does not
involve the substitution of
a heuristic attribute for a target attribute: It 
is due to the temporary salience of a particu- 
lar value of the target attribute. However, an- 
choring and attribute substitution are both 
instances of a broader family of accessibility 
effects (Kahneman, 2003). In attribute sub- 
stitution, a highly accessible attribute con- 
trols the evaluation of a less accessible one. 
In anchoring, a highly accessible value of 
the target attribute dominates its judgment. 
This conception is compatible with more 
recent theoretical treatments of anchoring
(see, e.g., Chapman & Johnson, 1994,
2002; Mussweiler & Strack, 1999; Strack &
Mussweiler, 1997).


Cross-Dimensional Mapping 


The process of attribute substitution in- 
volves the mapping of the heuristic at- 
tribute of the judgment object onto the 
scale of the target attribute. Our notion of 
cross-dimensional mapping extends Stevens’ 
(1975) concept of cross-modality matching. 
Stevens postulated that intensive attributes 
(e.g., brightness, loudness, the severity of
crimes) can be mapped onto a common scale 
of sensory strength, allowing direct matching 
of intensity across modalities — permitting, 
for example, respondents to match the loud- 
ness of sounds to the severity of crimes. Our 
conception allows other ways of comparing
values across dimensions, such as matching
relative positions (e.g., percentiles)
in the frequency distributions or ranges of 
different attributes (Parducci, 1965). An im- 
pression of a student’s position in the dis- 
tribution of aptitude may be mapped di- 
rectly onto a corresponding position in the 
distribution of academic achievement and 
then translated into a letter grade. Note 
that cross-dimensional matching is inher- 
ently nonregressive: A judgment or predic- 
tion is just as extreme as the impression 
mapped onto it. Ganzach and Krantz (1990) 
applied the term “univariate matching” to a 
closely related notion. 
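
The contrast between nonregressive matching and a regressive prediction can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the normally distributed grade scale, its mean and standard deviation, and the correlation `r` between the impression and the outcome are hypothetical, and the function names are not drawn from the studies cited.

```python
from statistics import NormalDist


def matching_prediction(percentile: float, target: NormalDist) -> float:
    """Nonregressive prediction: the same percentile on the target scale."""
    return target.inv_cdf(percentile)


def regressive_prediction(percentile: float, target: NormalDist, r: float) -> float:
    """Regression prediction: the standardized impression is shrunk by the correlation r."""
    z = NormalDist().inv_cdf(percentile)
    return target.mean + r * z * target.stdev


# Hypothetical grade distribution and hypothetical predictive validity.
gpa = NormalDist(mu=3.0, sigma=0.4)
print(round(matching_prediction(0.95, gpa), 2))           # 3.66
print(round(regressive_prediction(0.95, gpa, r=0.3), 2))  # 3.2
```

Under these assumptions, an impression at the 95th percentile is matched to a grade at the 95th percentile, whereas a regression prediction with modest validity stays much closer to the mean. The two coincide only when `r` equals 1.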

Cross-dimensional mapping presents special
problems when the scale of the target
[attribute has no upper bound.] Kahneman,
Ritov, and Schkade (1999) discussed
two situations in which an attitude (or af- 
fective valuation) is mapped onto an un- 
bounded scale of dollars: when respondents 
in surveys are required to indicate how much 
money they would contribute for a cause, 
and when jurors are required to specify an 
amount of punitive damages against a neg- 
ligent firm. The mapping of attitudes onto 
dollars is a variant of direct scaling in psy- 
chophysics, where respondents assign num- 
bers to indicate the intensity of sensations 
(Stevens, 1975). The normal practice of di- 
rect scaling is for the experimenter to pro- 
vide a modulus — a specified number that 
is to be associated with a standard stimu- 
lus. For example, respondents may be asked 
to assign the number 10 to the loudness of 
a standard sound and judge the loudness 
of other sounds relative to that standard. 
Stevens (1975) observed that when the ex- 
perimenter fails to provide a modulus, re- 
spondents spontaneously adopt one. How- 
ever, different respondents may pick moduli 
that differ greatly (sometimes varying by a 
factor of 100 or more); thus, the variability 
in judgments of particular stimuli is domi- 
nated by arbitrary individual differences in 
the choice of modulus. A similar analysis 
applies to situations in which respondents 
are required to use the dollar scale to ex- 
press affection for a species or outrage to- 
ward a defendant. Just as Stevens’ observers 
had no principled way to assign a number to 
a moderately loud sound, survey participants
and jurors have no principled way to scale 
affection or outrage into dollars. The anal- 
ogy of scaling without a modulus has been 
used to explain the notorious variability of 
dollar responses in surveys of willingness to 
pay and in jury awards (Kahneman, Ritov, 
& Schkade, 1999; Kahneman, Schkade, & 
Sunstein, 1998). 
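
A toy Python sketch shows why an arbitrary modulus can dominate the variability of raw responses. The numbers are hypothetical and the function names are illustrative assumptions. Two respondents are given identical perceived ratios to the standard but moduli that differ by a factor of 100.

```python
def magnitude_estimates(perceived_ratios, modulus):
    """Numeric responses when each stimulus is judged relative to a standard assigned `modulus`."""
    return [modulus * ratio for ratio in perceived_ratios]


def normalize(responses, modulus):
    """Divide out the respondent's own modulus to recover the judged ratios."""
    return [response / modulus for response in responses]


# Hypothetical: same perceived ratios, moduli differing by a factor of 100.
perceived = [0.5, 1.0, 2.0]
respondent_a = magnitude_estimates(perceived, modulus=10)
respondent_b = magnitude_estimates(perceived, modulus=1000)

print(respondent_a)  # [5.0, 10.0, 20.0]
print(respondent_b)  # [500.0, 1000.0, 2000.0]
print(normalize(respondent_a, 10) == normalize(respondent_b, 1000))  # True
```

The raw responses disagree by two orders of magnitude even though the underlying judgments are identical. In a dollar-scale task no shared modulus is supplied, so this source of disagreement cannot be divided out.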


System 2: The Supervision of 
Intuitive Judgments 


Our model assumes that an intuitive judg- 
ment is expressed overtly only if it is 
endorsed by system 2. The Stroop task
[...] Observers who are instructed to report the color
in which words are printed tend to stum- 
ble when the word is the name of another 
color (e.g., the word BLUE printed in green 
ink). The difficulty arises because the word is 
automatically read, and activates a response 
(“blue” in this case) that competes with the 
required response (“green”). Errors are rare 
in the Stroop test, indicating generally suc- 
cessful monitoring and control of the overt 
response, but the conflict produces delays 
and hesitations. The successful suppression 
of erroneous responses is effortful, and its 
efficacy is reduced by stress and distraction. 

Gilbert (1989) described a correction 
model in which initial impulses are often 
wrong and normally overridden. He argued 
that people initially believe whatever they 
are told (e.g., “Whitefish love grapes”) and 
that it takes some time and mental effort to 
“unbelieve” such dubious statements. Here 
again, cognitive load disrupts the control- 
ling operations of system 2, increasing the 
rate of errors and revealing aspects of intu- 
itive thinking that are normally suppressed. 
In an ingenious extension of this approach, 
Bodenhausen (1990) exploited natural tem- 
poral variability in alertness. He found that 
“morning people” were substantially more 
susceptible to a judgment bias (the conjunc- 
tion fallacy) in the evening and that “evening 
people” were more likely to commit the fal- 
lacy in the morning. 

Because system 2 is relatively slow, its op- 
erations can be disrupted by time pressure. 
Finucane et al. (2000) reported a study in 
which respondents judged the risks and ben- 
efits of various products and technologies 
(e.g., nuclear power, chemical plants, cellu- 
lar phones). When participants were forced 
to respond within 5 seconds, the correlations 
between their judgments of risks and their 
judgments of benefits were strongly nega- 
tive. The negative correlations were much 
weaker (although still pronounced) when re- 
spondents were given more time to ponder 
a response. When time is short, the same 
affective evaluation apparently serves as a 
heuristic attribute for assessments of both 
benefits and risks. Respondents can move
[beyond this simple strategy, but they need]
more than 5 seconds to do so. As this example
illustrates, judgment by heuristic often
yields simplistic assessments, which system 2 
sometimes corrects by bringing additional 
considerations to bear. 

Attribute substitution can be prevented 
by alerting respondents to the possibility 
that their judgment could be contaminated 
by an irrelevant variable. For example, al- 
though sunny or rainy weather typically af- 
fects reports of well-being, Schwarz and 
Clore (1983) found that weather has no 
effect if respondents are asked about the 
weather just before answering the well- 
being question. Apparently, this question re- 
minds respondents that their current mood 
(a candidate heuristic attribute) is influ- 
enced by a factor (current weather) that is 
irrelevant to the requested target attribute 
(overall well-being). Schwarz (1996) also 
found that asking people to describe their 
satisfaction with some particular domain of 
life reduces the weight this domain receives 
in a subsequent judgment of overall
well-being. As these examples illustrate, although
priming typically increases the weight of that 
variable on judgment (a system 1 effect), this 
does not occur if the prime is a sufficiently 
explicit reminder that brings the self-critical 
operations of system 2 into play. 

We suspect that system 2 endorsements of 
intuitive judgments are granted quite casu- 
ally under normal circumstances. Consider 
the puzzle “A bat and a ball cost $1.10 in to- 
tal. The bat costs $1 more than the ball. How 
much does the ball cost?” Almost everyone 
we ask reports an initial tendency to answer 
“10 cents” because the sum $1.10 separates
naturally into $1 and 10 cents, and 10 cents 
is about the right magnitude. Many peo- 
ple yield to this immediate impulse. Even 
among undergraduates at elite institutions, 
about half get this problem wrong when it 
is included in a short IQ test (Frederick, 
2004). The critical feature of this problem 
is that anyone who reports 10 cents has ob- 
viously not taken the trouble to check his 
or her answer. The surprisingly high rate 
of errors in this easy problem illustrates 
how lightly system 2 monitors the output of
[system 1: people are often content to trust]
a plausible judgment that quickly comes to
mind. (The correct answer, by the way, is 
5 cents.) 
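
The check that the intuitive answer fails is a one-line piece of algebra, sketched here in Python. The function names are illustrative assumptions, and amounts are kept in whole cents to avoid floating-point rounding.

```python
def ball_price_cents(total_cents: int = 110, difference_cents: int = 100) -> int:
    """Solve ball + bat = total with bat = ball + difference, in whole cents."""
    # Substituting the second constraint into the first gives 2 * ball + difference = total.
    return (total_cents - difference_cents) // 2


def is_consistent(ball_cents: int, total_cents: int = 110, difference_cents: int = 100) -> bool:
    """Check a proposed ball price against both constraints of the puzzle."""
    bat_cents = ball_cents + difference_cents
    return ball_cents + bat_cents == total_cents


print(ball_price_cents())  # 5
print(is_consistent(5))    # True:  5 + 105 = 110
print(is_consistent(10))   # False: 10 + 110 = 120
```

A respondent who answers 10 cents needs only the second function, a single addition, to discover the error.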

The bat and ball problem elicits many er- 
rors, although it is not really difficult and 
certainly not ambiguous. A moral of this 
example is that people often make quick 
intuitive judgments to which they are not 
deeply committed. A related moral is that 
we should be suspicious of analyses that ex- 
plain apparent errors by attributing to re- 
spondents a bizarre interpretation of the 
question. Consider someone who answers a 
question about happiness by reporting her 
satisfaction with her romantic life. The re- 
spondent is surely not committed to the ab- 
surdly narrow interpretation of happiness 
that her response seemingly implies. More 
likely, at the time of answering, she thinks 
that she is reporting happiness: A judgment 
comes quickly to mind and is not obviously 
mistaken — end of story. Similarly, we pro- 
pose that respondents who judge probabil- 
ity by representativeness do not seriously be- 
lieve that the questions “How likely is X to 
be a Y?” and “How much does X resemble 
the stereotype of Y?” are synonymous. Peo- 
ple who make a casual intuitive judgment 
normally know little about how their judg- 
ment came about and know even less about 
its logical entailments. Attempts to recon- 
struct the meaning of intuitive judgments by 
interviewing respondents (see, e.g., Hertwig 
& Gigerenzer, 1999) are therefore unlikely 
to succeed because such probes require bet- 
ter introspective access and more coherent 
beliefs than people normally muster. 


Identifying a Heuristic 


Hypotheses about judgment heuristics have 
most often been studied by examining 
weighting biases and deviations from normative
rules. However, the hypothesis that
one attribute is substituted for another in a 
judgment task — for example, representative- 
ness for probability — can also be tested more 
directly. In the heuristic elicitation design,
[one group of respondents provides] judgments
of a target attribute for a set of objects
and another group evaluates the hypothesized
heuristic attribute for the same
objects. The substitution hypothesis im- 
plies that the judgments of the two groups, 
when expressed in comparable units (e.g., 
percentiles), will be identical. This section 
examines several applications of heuristic 
elicitation. 
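
The comparison at the heart of a heuristic elicitation design can be sketched as follows. This is a minimal illustration and not the analysis used in any of the studies cited: the rankings are invented, the function names are assumptions, and agreement between the two groups is summarized by a Pearson correlation over mean ranks.

```python
from statistics import mean


def mean_ranks(rankings):
    """rankings[r][i] is the rank respondent r gave to object i; returns the mean rank per object."""
    return [mean(column) for column in zip(*rankings)]


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical rankings of four objects by three respondents per group.
target_group = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]     # ranks the target attribute
heuristic_group = [[1, 2, 4, 3], [1, 2, 3, 4], [2, 1, 3, 4]]  # ranks the heuristic attribute

r = pearson(mean_ranks(target_group), mean_ranks(heuristic_group))
print(round(r, 2))
```

A correlation near 1 between the two sets of mean ranks is what the substitution hypothesis predicts. A clearly lower value would count against it.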


Eliciting Representativeness 


Figure 12.1 displays the results of two ex- 
periments in which a measure of represen- 
tativeness was elicited. These results were 
published long ago, but we repeat them here 
because they still provide the most direct 
evidence for both attribute substitution and 
the representativeness heuristic. For a more 
recent application of a similar design, see 
Bar-Hillel and Neter (1993). 

The object of judgment in the study from 
which Figure 12.1(a) is drawn (Kahneman & 
Tversky, 1973; p. 127 in Kahneman, Slovic, 
& Tversky, 1982) was the following descrip- 
tion of a fictitious graduate student, which 
was shown along with a list of nine fields of 
graduate specialization: 


Tom W. is of high intelligence, although 
lacking in true creativity. He has a need 
for order and clarity and for neat and tidy 
systems in which every detail finds its ap- 
propriate place. His writing is rather dull 
and mechanical, occasionally enlivened by 
somewhat corny puns and by flashes of 
imagination of the sci-fi type. He has a 
strong drive for competence. He seems to 
have little feel and little sympathy for other 
people and does not enjoy interacting with 
others. Self-centered, he nonetheless has a 
deep moral sense. 


Participants in a representativeness group 
ranked the nine fields of specialization by 
the degree to which Tom W. “resembles a 
typical graduate student.” Participants in the 
probability group ranked the nine fields
according to the likelihood of Tom W.’s
specializing in each. Figure 12.1(a) plots the
mean judgments of the two groups. The
correlation between representativeness and
probability is nearly perfect (.97). No
stronger support for attribute substitution
could be imagined. However, interpreting
representativeness as the heuristic attribute
in these judgments does require two additional
plausible assumptions: that representativeness
is more accessible than probability, and that
there is no third attribute that could explain
both judgments.

Figure 12.1. (a) Plot of average ranks for nine outcomes for Tom W. ranked by probability and by
similarity to stereotypes of graduate students in various fields. (b) Plot of average ranks for eight
outcomes for Linda ranked by probability and by representativeness.

The Tom W. study was also intended to
examine the effect of the base rates of
outcomes on categorical prediction. For that
purpose, respondents in a third group
estimated the proportion of graduate students
enrolled in each of the nine fields. By design,
some outcomes were defined quite broadly,
whereas others were defined more narrowly.
As intended, estimates of base rates varied
markedly across fields, ranging from 3%
for Library Science to 20% for Humanities
and Education. Also by design, the description
of Tom W. included characteristics (e.g.,
introversion) that were intended to make
him fit the stereotypes of the smaller fields
(library science, computer science) better
than the larger fields (humanities and social
sciences). As intended, the correlation
between the average judgments of
representativeness and of base rates was
strongly negative (−.65).

The logic of probabilistic prediction in
this task suggests that the ranking of
outcomes by their probabilities should be
intermediate between their rankings by
representativeness and by base rate frequencies.
Indeed, if the personality description is taken
to be a poor source of information,
probability judgments should stay quite close to
the base rates. The description of Tom W.
was designed to allow considerable scope
for judgments of probability to diverge from
judgments of representativeness, as this logic
requires. Figure 12.1(a) shows no such
divergence. Thus, the results of the Tom W.
study simultaneously demonstrate the
substitution of representativeness for
probability and the neglect of known (but not
explicitly mentioned) base rates.

Figure 12.1(b) is drawn from an early
study of the Linda problem, the best-known
and most controversial example in the
representativeness literature (Tversky &
Kahneman, 1982) in which a woman named Linda
was described as follows:


Linda is 31 years old, single, outspoken
and very bright. She majored in philosophy.
As a student she was deeply concerned
with issues of discrimination and social
justice and also participated in antinuclear
demonstrations.


[Separate groups] of respondents were asked
to rank a set of eight outcomes by
representativeness and probability. The
results are shown in Figure 12.1(b). Again
the correlation between these rankings was
almost perfect (.99).

Six of the eight outcomes that subjects 
were asked to rank were fillers (e.g., ele- 
mentary school teacher, psychiatric social 
worker). The two critical outcomes were #6 
(bank teller) and the so-called conjunction 
item #8 (bank teller and active in the fem- 
inist movement). Most subjects ranked the 
conjunction higher than its constituent, both 
in representativeness (85%) and probabil- 
ity (89%). The observed ranking of the two 
items is quite reasonable for judgments of 
similarity, but not for probability: Linda may 
resemble a feminist bank teller more than 
she resembles a bank teller, but she cannot 
be more likely to be a feminist bank teller 
than to be a bank teller. In this problem, re- 
liance on representativeness yields probabil- 
ity judgments that violate a basic logical rule. 
As in the Tom W. study, the results make two
points: They support the hypothesis of at- 
tribute substitution and also illustrate a pre- 
dictable judgment error. 


The Representativeness Controversy 


The experiments summarized in Figure 12.1 
provided direct evidence for the represen- 
tativeness heuristic and two concomitant 
biases: neglect of base rates and conjunc- 
tion errors. In the terminology introduced 
by Tversky and Kahneman (1983), the de- 
sign of these experiments was “subtle”: Ad- 
equate information was available for partic- 
ipants to avoid the error, but no effort was 
made to call their attention to that informa- 
tion. For example, participants in the Tom 
W. experiment had general knowledge of the 
relative base rates of the various fields of spe- 
cialization, but these base rates were not ex- 
plicitly mentioned in the problem. Similarly, 
both critical items in the Linda experiment 
were included in the list of outcomes, but
they were separated by a filler so respondents
would not feel compelled to compare them.
In the anthropomorphic language used here, 
system 2 was given a chance to correct the 
judgment but was not prompted to do so. 
In view of the confusing controversy that 
followed, it is perhaps unfortunate that the 
articles documenting base rate neglect and 
conjunction errors did not stop with subtle 
tests. Each article also contained an experi- 
mental flourish — a demonstration in which 
the error occurred in spite of a manipula- 
tion that called participants’ attention to the 
critical variable. The engineer-lawyer problem
(Kahneman & Tversky, 1973) included
special instructions to ensure that respon- 
dents would notice the base rates of the 
outcomes. The brief personality descriptions 
shown to respondents were reported to have 
been drawn from a set containing descrip- 
tions of 30 lawyers and 70 engineers (or vice 
versa), and respondents were asked “What 
is the probability that this description be- 
longs to one of the 30 lawyers in the sample 
of 100?” To the authors’ surprise, base rates 
were largely neglected in the responses, de- 
spite their salience in the instructions. Sim- 
ilarly, the authors were later shocked to dis- 
cover that more than 80% of undergraduates 
committed a conjunction error even when 
asked point blank whether Linda was more 
likely to be “a bank teller” or “a bank teller 
who is active in the feminist movement” 
(Tversky & Kahneman, 1983). The novelty 
of these additional direct or “transparent” 
tests was the finding that respondents con- 
tinued to show the biases associated with 
representativeness even in the presence of 
strong cues pointing to the normative re- 
sponse. The errors that people make in trans- 
parent judgment problems are analogous to 
observers’ failure to allow for ambient haze 
in estimating distances: A correct response 
is within reach, but not chosen, and the fail- 
ure involves an unexpected weakness of the 
corrective operations of system 2. 
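
What it would mean to use the stated base rates can be illustrated with Bayes' rule in odds form. The sketch below is a hedged illustration: the likelihood ratios are hypothetical values chosen for the example, the function name is an assumption, and only the 30/70 composition comes from the problem as described.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """P(lawyer | description), given the prior P(lawyer) and the likelihood ratio
    P(description | lawyer) / P(description | engineer)."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical description that is equally typical of both professions:
# the posterior should simply equal the base rate.
print(round(posterior_probability(0.30, 1.0), 2))  # 0.3
print(round(posterior_probability(0.70, 1.0), 2))  # 0.7

# Hypothetical description judged four times as likely for an engineer as for a lawyer.
low = posterior_probability(0.30, 0.25)
high = posterior_probability(0.70, 0.25)
print(round(low, 2), round(high, 2))  # 0.1 0.37
```

Under these assumptions, the same description warrants very different probabilities in the two base-rate conditions. Judgments that barely move between conditions are what the text describes as base rates being largely neglected.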
Discussions of the heuristics and biases 
approach have focused almost exclusively 
on the direct conjunction fallacy and on 
the engineer-lawyer problems. These are
[the studies that have been] extensively
replicated with varying parameters.
The amount of critical attention is remark- 
able because the studies were not, in fact, 
essential to the authors’ central claim. In 
terms of the present treatment, the claim 
was that intuitive prediction is an operation 
of system 1, which is susceptible to both base 
rate neglect and conjunction fallacies. There 
was no intent to deny the possibility of sys- 
tem 2 interventions that would modify or 
override intuitive predictions. Thus, the ar- 
ticles in which these studies appeared would 
have been substantially the same, although 
far less provocative, if respondents had over- 
come base rate neglect and conjunction er- 
rors in transparent tests. 

To appreciate why the strong forms of 
base rate neglect and of the conjunction fal- 
lacy sparked so much controversy, it is use- 
ful to distinguish two conceptions of human 
rationality (Kahneman, 2000b). Coherence 
rationality is the strict conception that re- 
quires the agent’s entire system of beliefs 
and preferences to be internally consistent 
and immune to effects of framing and context.
For example, an individual’s probability
p(“Linda is a bank teller”) should be the
sum of the probabilities p(“Linda is a bank
teller and a feminist”) and p(“Linda is a bank
teller and not a feminist”). A subtle test of
coherence rationality could be conducted by 
asking individuals to assess these three prob- 
abilities on separate occasions under circum- 
stances that minimize recall. Coherence can 
also be tested in a between-groups design. If 
random assignment is assumed, the sum of 
the average probabilities assigned to the two 
component events should equal the average 
judged probability of “Linda is a bank teller.” 
If this prediction fails, then at least some 
individuals are incoherent. Demonstrations 
of incoherence present a significant chal- 
lenge to important models of decision the- 
ory and economics, which attribute to agents 
a very strict form of rationality (Tversky & 
Kahneman, 1986). Failures of perfect coher- 
ence are less provocative to psychologists, 
who have a more realistic view of human 
capabilities. 
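
The two coherence requirements mentioned above, the conjunction rule and additivity over a partition, can be written as simple checks. The judged probabilities below are hypothetical, and the function names are illustrative assumptions. In a between-groups design the same functions could be applied to group averages.

```python
def violates_conjunction_rule(p_a: float, p_a_and_b: float) -> bool:
    """A conjunction cannot be more probable than one of its constituents."""
    return p_a_and_b > p_a


def additivity_gap(p_a: float, p_a_and_b: float, p_a_and_not_b: float) -> float:
    """Coherence requires p(A) = p(A and B) + p(A and not B); a nonzero gap signals incoherence."""
    return (p_a_and_b + p_a_and_not_b) - p_a


# Hypothetical judged probabilities for the Linda items.
p_teller = 0.10
p_teller_and_feminist = 0.25
p_teller_and_not_feminist = 0.05

print(violates_conjunction_rule(p_teller, p_teller_and_feminist))  # True
print(round(additivity_gap(p_teller, p_teller_and_feminist, p_teller_and_not_feminist), 2))  # 0.2
```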


[A more lenient conception, reasoning] rationality, only requires an ability to reason
correctly about the information currently 
at hand without demanding perfect consis- 
tency among beliefs that are not simulta- 
neously evoked. The best known violation 
of reasoning rationality is the famous “four 
card” problem (Wason, 1960). The failure of 
intelligent adults to reason their way through 
this problem is surprising because the prob- 
lem is “easy” in the sense of being easily 
understood once explained. What everyone 
learns, when first told that intelligent peo- 
ple fail to solve the four-card problem, is 
that one’s expectations about human rea- 
soning abilities had not been adequately cal- 
ibrated. There is, of course, no well-defined 
metric of reasoning rationality, but whatever 
metric one uses, the Wason problem calls 
for a downward adjustment. The surprising 
results of the Linda and engineer-lawyer
problems led Tversky and Kahneman to a 
similar realization: The reasoning of their 
subjects was less proficient than they had an- 
ticipated. Many readers of the work shared 
this conclusion, but many others strongly 
resisted it. 

The implicit challenge to reasoning ra- 
tionality was met by numerous attempts to 
dismiss the findings of the engineer-lawyer 
and the Linda studies as artifacts of ambigu- 
ous language, confusing instructions, conver- 
sational norms, or inappropriate normative 
standards. Doubts have been raised about 
the proper interpretation of almost every 
word in the conjunction problem, including 
“bank teller,” “probability,” and even “and” 
(see, e.g., Dulany & Hilton, 1991; Hilton & 
Slugoski, 2001). These claims are not dis- 
cussed in detail here. We suspect that most 
of them have some validity and that they 
identified mechanisms that may have made 
the results in the engineer-lawyer and Linda 
studies exceptionally strong. However, we 
note a significant weakness shared by all 
these critical discussions: They provide no 
explanation of the essentially perfect con- 
sistency of the judgments observed in di- 
rect tests of the conjunction rule and in 
three other types of experiments: subtle
[...] most important, judgments of
representativeness (see also Bar-Hillel & Neter, 1993).
Interpretations of the conjunction fallacy 
as an artifact implicitly dismiss the results 
of Figure 12.1(b) as a coincidence (for an 
exception, see Ayton, 1998). The story of 
the engineer-lawyer problem is similar. Here 
again, multiple demonstrations in which 
base rate information was used (see Koehler, 
1996, for a review) invited the inference that 
there is no general problem of base rate ne- 
glect. Again, the data of prediction by repre- 
sentativeness in Figure 12.1(a) (and related 
results reported by Kahneman & Tversky, 
1973) were ignored. 

The demonstrations that under some con- 
ditions people avoid the conjunction fallacy 
in direct tests, or use explicit base rate in- 
formation, led some scholars to the blanket 
conclusion that judgment biases are artifi- 
cial and fragile and that there is no need for 
judgment heuristics to explain them. This 
position was promoted most vigorously by 
Gigerenzer (1991). Kahneman and Tversky 
(1996) argued in response that the heuris- 
tics and biases position does not preclude the 
possibility of people’s performing flawlessly 
in particular variants of the Linda and the 
engineer-lawyer problems. Because laypeople 
readily acknowledge the validity of 
the conjunction rule and the relevance of 
base rate information, the fact that they 
sometimes obey these principles is neither a 
surprise nor an argument against the role of 
representativeness in routine intuitive pre- 
diction. However, the study of conditions 
under which errors are avoided can help us 
understand the capabilities and limitations 
of system 2. We develop this argument fur- 
ther in the next section. 


Making Biases Disappear: A Task 
for System 2 


Much has been learned over the years about 
variables and experimental procedures that 
reduce or eliminate the biases associated 
with representativeness. We next discuss 
conditions under which errors of intuition 
are successfully overcome, and some circumstances 
under which intuitions may not be 
evoked at all. 


STATISTICAL SOPHISTICATION 


The performance of statistically sophisti- 
cated groups of respondents in different ver- 
sions of the Linda problem illustrates the ef- 
fects of both expertise and research design 
(Tversky & Kahneman, 1983). Statistical ex- 
pertise provided no advantage in the eight- 
item version in which the critical items were 
separated by a filler and were presumably 
considered separately. In the two-item ver- 
sion, in contrast, respondents were effec- 
tively compelled to compare “bank teller” 
with “bank teller and is active in the femi- 
nist movement.” The incidence of conjunc- 
tion errors remained essentially unchanged 
among the statistically naive in this condi- 
tion but dropped dramatically for the statis- 
tically sophisticated. Most of the experts fol- 
lowed logic rather than intuition when they 
recognized that one of the categories con- 
tained the other. In the absence of a prompt 
to compare the items, however, the statis- 
tically sophisticated made their predictions 
in the same way as everyone else does — by 
representativeness. As Stephen Jay Gould 
(1991, p. 469) noted, knowledge of the truth 
does not dislodge the feeling that Linda is a 
feminist bank teller: “I know [the right an- 
swer], yet a little homunculus in my head 
continues to jump up and down, shouting at 
me — ‘but she can’t just be a bank teller; read 
the description.’” 


INTELLIGENCE 


Stanovich (1999) and Stanovich and West 
(2002) observed a generally negative corre- 
lation between conventional measures of in- 
telligence and susceptibility to judgment bi- 
ases. They used transparent versions of the 
problems, which include adequate cues to 
the correct answer and therefore provide 
a test of reasoning rationality. Not surpris- 
ingly, intelligent people are more likely to 
possess the relevant logical rules and also to 
recognize the applicability of these rules in 
particular situations. In the terms of 
the present analysis, high-IQ respondents 
benefit from relatively efficient system 2 op- 
erations that enable them to overcome er- 
roneous intuitions when adequate informa- 
tion is available. (However, when a problem 
is too difficult for everyone, the correlation 
may reverse because the more intelligent re- 
spondents are more likely to agree on a plau- 
sible error than to respond randomly, as dis- 
cussed in Kahneman, 2000b.) 


FREQUENCY FORMAT 


Relative frequencies (e.g., 1 in 10) are more 
vividly represented and more easily under- 
stood than equivalent probabilities (.10) or 
percentages (10%). For example, the emo- 
tional impact of statements of risk is en- 
hanced by the frequency format: “1 person 
in 1000 will die” is more frightening than a 
probability of .001 (Slovic et al., 2002). The 
frequency representation also makes it eas- 
ier to visualize partitions of sets and detect 
that one set is contained in another. As a 
consequence, the conjunction fallacy is gen- 
erally avoided in direct tests in which the 
frequency format makes it easy to recog- 
nize that feminist bank tellers are a subset of 
bank tellers (Gigerenzer & Hoffrage, 1995; 
Tversky & Kahneman, 1983). For similar rea- 
sons, some base rate problems are more eas- 
ily solved when couched in frequencies than 
in probabilities or percentages (Cosmides & 
Tooby, 1996). However, there is little sup- 
port for the more general claims about the 
evolutionary adaptation of the mind to deal 
with frequencies (Evans et al., 2000). Fur- 
thermore, the ranking of outcomes by pre- 
dicted relative frequency is very similar to 
the ranking of the same outcomes by rep- 
resentativeness (Mellers, Hertwig, & Kahne- 
man, 2001). We conclude that the frequency 
format affects the corrective operations of 
system 2, not the intuitive operations of sys- 
tem 1. The language of frequencies improves 
respondents’ ability to impose the logic of 
set inclusion on their considered judgments 
but does not reduce the role of representa- 
tiveness in their intuitions. 


MANIPULATIONS OF ATTENTION 


The weight of neglected variables can be in- 
creased by drawing attention to them, and 
experimenters have devised many ingenious 
ways to do so. Schwarz et al. (1991) found 
that respondents pay more attention to base 
rate information when they are instructed 
to think as statisticians rather than clini- 
cal psychologists. Krosnick, Li, and Lehman 
(1990) exploited conversational conventions 
about the sequencing of information and 
confirmed that the impact of base rate in- 
formation was enhanced by presenting that 
information after the personality descrip- 
tion rather than before it. Attention to the 
base rate is also enhanced when partici- 
pants observe the drawing of descriptions 
from an urn (Gigerenzer, Hell, & Blank, 
1988), perhaps because watching the drawing 
induces conscious expectations that reflect 
the known proportions of possible outcomes. 
The conjunction fallacy can also 
be reduced or eliminated by manipulations 
that increase the accessibility of the rel- 
evant rule, including some linguistic vari- 
ations (Macchi, 1995), and practice with 
logical problems (Agnoli, 1991; Agnoli & 
Krantz, 1989). 

The interpretation of these attentional ef- 
fects is straightforward. We assume most 
participants in judgment studies know, at 
least vaguely, that the base rate is rele- 
vant and that the conjunction rule is valid 
(Kahneman & Tversky, 1982). Whether they 
apply this knowledge to override an intu- 
itive judgment depends on their cognitive 
skills (education, intelligence) and on formulations 
that make the applicability of a 
rule apparent (frequency format) or a rel- 
evant factor more salient (manipulations of 
attention). We assume intuitions are less sen- 
sitive to these factors and that the appear- 
ance or disappearance of biases mainly re- 
flects variations in the efficacy of corrective 
operations. This conclusion would be circu- 
lar, of course, if the corrective operations 
were both inferred from the observation of 
correct performance and used to explain that 
performance. Fortunately, the circularity can 
be avoided because the role of system 2 
can be verified, for example, by using manipulations 
of time pressure, cognitive load, 
or mood to interfere with its operations. 


WITHIN-SUBJECTS FACTORIAL DESIGNS 


The relative virtues of between-subjects and 
within-subject designs in studies of judgment 
are a highly contentious issue. Factorial 
designs have their dismissive critics (e.g., 
Poulton, 1989) and their vigorous defenders 
(e.g., Birnbaum, 1999). We do not attempt 
to adjudicate this controversy here. Our nar- 
rower point is that between-subjects designs 
are more appropriate for the study of heuris- 
tics of judgment. The following arguments 
favor this conclusion: 


* Factorial designs are transparent. Partici- 
pants are likely to identify the variables 
that are manipulated, especially if there 
are many trials and especially in a fully 
factorial design in which the same stimu- 
lus attributes are repeated in varying com- 
binations. The message that the design 
conveys to the participants is that the ex- 
perimenter expects to find effects of ev- 
ery factor that is manipulated (Bar-Hillel 
& Fischhoff, 1981; Schwarz, 1996). 

* Studies that apply a factorial design 
to judgment tasks commonly involve 
schematic and impoverished stimuli. The 
tasks are also highly repetitive. These 
features encourage participants to adopt 
simple mechanical rules that will allow 
them to respond quickly without forming 
an individuated impression of each stimulus. 
For example, Ordóñez and Benson 
(1997) required respondents to judge the 
attractiveness of gambles on a 100-point 
scale. They found that under time pres- 
sure many respondents computed or esti- 
mated the expected values of the gambles 
and used the results as attractiveness rat- 
ings (e.g., a rating of 15 for a 52% chance 
to win $31.50). 

* Factorial designs often yield judgments 
that are linear combinations of the ma- 
nipulated variables. This is a central 
conclusion of a massive research effort 
conducted by Anderson (1996), who 
observed that people often average or add 
where they should multiply. 


In summary, the factorial design is not 
appropriate for testing hypotheses about bi- 
ases of neglect because it effectively guaran- 
tees that no manipulated factor is neglected. 
Figure 12.2 illustrates this claim by sev- 
eral examples of an additive extension effect 
that we discuss further in the next section. 
The experiments summarized in the differ- 
ent panels share three important features: 
(1) In each case, the quantitative variable 
plotted on the abscissa was completely ne- 
glected in similar experiments conducted in 
a between-subjects or subtle design; (2) in 
each case, the quantitative variable com- 
bines additively with other information; (3) 
in each case, a compelling normative ar- 
gument can be made for a quasimulti- 
plicative rule in which the lines shown in 
Figure 12.2 should fan out. For example, Fig- 
ure 12.2(c) presents a study of categorical 
prediction (Novemsky & Kronzon, 1999) in 
which respondents judged the relative likelihood 
that a person was a member of one 
occupation rather than another (e.g., com- 
puter programmer vs. flight attendant) on 
the basis of short personality sketches (e.g., 
“shy, serious, organized, and sarcastic”) and 
one of three specified base rates (10%, 50%, 
or 90%). Representativeness and base rate 
were varied factorially within subjects. The 
effect of base rate is clearly significant in this 
design (see also Birnbaum & Mellers, 1983). 
Furthermore, the effects of representative- 
ness and base rate are strictly additive. As 
Anderson (1996) argued, averaging (a spe- 
cial case of additive combination) is the most 
obvious way to combine the effects of two 
variables that are recognized as relevant (e.g., 
“She looks like a bank teller, but the base-rate 
is low.”). Additivity is not normatively ap- 
propriate in this case — any Bayes-like com- 
bination would produce curves that initially 
fan out from the origin and converge again 
at high values. Similar considerations apply 
to the other three panels of Figure 12.2, 
discussed later. Between-subjects and factorial 
designs often yield different results in studies 
of intuitive judgment. Why should we 
believe one design rather than the other? 
The main argument against the factorial design 
is its poor ecological validity. Encountering 
multiple judgment objects in rapid succession 
in a rigidly controlled structure is 
unique to the laboratory, and the solutions 
that they evoke are not likely to be typical. 
Direct comparisons among concepts that 
differ in only one variable, such as bank 
tellers and feminist bank tellers, also provide 
a powerful hint and a highly unusual opportunity 
to overcome intuitions. The between-subjects 
design, in contrast, mimics the haphazard 
encounters in which most judgments 
are made and is more likely to evoke the casually 
intuitive mode of judgment that governs 
much of mental life in routine situations 
(e.g., Langer, 1978). 


Figure 12.2. (a) Willingness to pay to restore damage to species that differ in popularity as a function 
of the damage they have suffered (from Kahneman, Ritov, & Schkade, 2000); (b) global evaluations of 
aversive sounds of different loudness as a function of duration for subjects selected for their high 
sensitivity to duration (from Schreiber & Kahneman, 2000); (c) ratings of probability for predictions 
that differ in representativeness as a function of base rate frequency (from Novemsky & Kronzon, 
1999); (d) global evaluations of episodes of painful pressure that differ in temporal profile as a 
function of duration (Ariely, 1998). 
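
The contrast drawn above between additive and Bayes-like combination in the categorical-prediction study [Figure 12.2(c)] can be illustrated numerically. What follows is a minimal sketch under stated assumptions, not the analysis used in the studies cited: the likelihood ratios, the representativeness scores, the equal-weight averaging rule, and all function names are hypothetical and were chosen only to show the shape of the two patterns.

```python
def bayes_posterior(base_rate, likelihood_ratio):
    """Combine a base rate with evidence multiplicatively, in odds form."""
    prior_odds = base_rate / (1.0 - base_rate)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def additive_judgment(base_rate, representativeness, weight=0.5):
    """Weighted average of the base rate and a 0-to-1 representativeness score."""
    return weight * base_rate + (1.0 - weight) * representativeness


# Hypothetical inputs, chosen only for illustration.
base_rates = [0.10, 0.50, 0.90]
high_sketch = {"likelihood_ratio": 4.0, "representativeness": 0.8}
low_sketch = {"likelihood_ratio": 0.25, "representativeness": 0.2}

bayes_gaps = []      # distance between the high and low curves, Bayes-like rule
additive_gaps = []   # the same distance under the averaging rule

for rate in base_rates:
    bayes_gaps.append(
        bayes_posterior(rate, high_sketch["likelihood_ratio"])
        - bayes_posterior(rate, low_sketch["likelihood_ratio"])
    )
    additive_gaps.append(
        additive_judgment(rate, high_sketch["representativeness"])
        - additive_judgment(rate, low_sketch["representativeness"])
    )

for rate, b_gap, a_gap in zip(base_rates, bayes_gaps, additive_gaps):
    print(f"base rate {rate:.2f}: Bayes-like gap {b_gap:.3f}, additive gap {a_gap:.3f}")
```

With these assumed inputs, the gap between the two Bayes-like curves is widest at the 50% base rate and narrower at 10% and 90%, whereas the averaging rule keeps the gap constant and so yields parallel lines.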


Prototype Heuristics and the Neglect 
of Extension 


In this section, we offer a common account 
of three superficially dissimilar judgmental 
tasks: (1) categorical prediction (e.g., “In a 
set of 30 lawyers and 70 engineers, what is the 
probability that someone described as ‘charming, 
talkative, clever, and cynical’ is one of the 
lawyers?”); (2) summary evaluations of past 
events (e.g., “Overall, how aversive was it to 
be exposed for 30 minutes to your neighbor's 
car alarm?”); and (3) economic valuations 
of public goods (e.g., “What is the most you 
would be willing to pay to prevent 200,000 mi- 
grating birds from drowning in uncovered oil 
ponds?”). We propose that a generalization 
of the representativeness heuristic accounts 
for the remarkably similar biases that are ob- 
served in these diverse tasks. 

The original analysis of categorical pre- 
diction by representativeness (Kahneman & 
Tversky, 1973; Tversky & Kahneman, 1983) 
invoked two assumptions in which the word 
“representative” was used in different ways: 
(1) A prototype (a representative exemplar) 
is used to represent categories (e.g., bank 
tellers) in the prediction task, and (2) the 
probability that the individual belongs to a 
category is judged by the degree to which the 
individual resembles (is representative of) the 
category stereotype. Thus, categorical pre- 
diction by representativeness involves two 
separate acts of substitution — the substitu- 
tion of a representative exemplar for a cat- 
egory and the substitution of the heuris- 
tic attribute of representativeness for the 
target attribute of probability. Perhaps be- 
cause they share a label, the two pro- 
cesses have not been distinguished in dis- 
cussions of the representativeness heuristic. 
We separate them here by describing proto- 
type heuristics in which a prototype is sub- 
stituted for its category, but in which repre- 
sentativeness is not necessarily the heuristic 
attribute. 

The target attributes to which prototype 
heuristics are applied are extensional. An ex- 
tensional attribute pertains to an aggregated 
property of a set or category for which an 
extension is specified — the probability that 
a set of 30 lawyers includes Jack, the over- 
all unpleasantness of a set of moments of 
hearing a neighbor’s car alarm, and the personal 
dollar value of saving a certain number 
of birds from drowning in oil ponds. Normative 
judgments of extensional attributes 
are governed by a general principle of conditional 
adding, which dictates that each element 
of the set adds to the overall judgment 
an amount that depends on the elements 
already included. In simple cases, 
conditional adding is just regular adding — 
the total weight of a collection of chairs is 
the sum of their individual weights. In other 
cases, each element of the set contributes 
to the overall judgment, but the combina- 
tion rule is not simple addition and is most 
typically subadditive. For example, the eco- 
nomic value of protecting X birds should be 
increasing in X, but the value of saving 2000 
birds is for most people less than twice as 
large as the value of saving 1000 birds. 

The logic of categorical prediction entails 
that the probability of membership in a cat- 
egory should vary with its relative size, or 
base rate. In prediction by representative- 
ness, however, the representation of out- 
comes by prototypical exemplars effectively 
discards base rates because the prototype of a 
category (e.g., lawyers) contains no informa- 
tion about the size of its membership. Next, 
we show that phenomena analogous to the 
neglect of base rate are observed in other 
prototype heuristics: The monetary value at- 
tached to a public good is often insensitive 
to its scope, and the global evaluation of a 
temporally extended experience is often in- 
sensitive to its duration. These various in- 
stantiations of extension neglect (neglect of 
base rates, scope, and duration) have been 
discussed in separate literatures, but all can 
be explained by the two-part process that 
defines prototype heuristics: (1) A category 
is represented by a prototypical exemplar, 
and (2) a (nonextensional) property of the 
prototype is then used as a heuristic attribute 
to evaluate an extensional target attribute of 
the category. As might be expected from the 
earlier discussion of base rate neglect, exten- 
sion neglect in all its forms is most likely to be 
observed in between-subjects experiments. 
Within-subject factorial designs consistently 
yield the additive extension effect illustrated 
in Figure 12.2. 


Scope Neglect in Willingness to Pay 


The contingent valuation method (CVM) 
was developed by resource economists (see 
Mitchell & Carson, 1989) as a tool for 
assessing the value of public goods for purposes 
of litigation or cost-benefit analysis. 
Participants in contingent valuation (CV) 
surveys are asked to indicate their willing- 
ness to pay (WTP) for specified public goods, 
and their responses are used to estimate the 
total amount that the community would pay 
to obtain these goods. The economists who 
design contingent valuation surveys inter- 
pret WTP as a valid measure of economic 
value and assume that statements of WTP 
conform to the extensional logic of con- 
sumer theory. The relevant logic has been 
described by a critic of CVM (Diamond, 
1996), who illustrates the conditional adding 
rule by the following example: In the ab- 
sence of income effects, WTP for saving X 
birds should equal WTP for saving (X - k) 
birds, plus WTP to save k birds, where the 
last value is contingent on the costless prior 
provision of safety for (X - k) birds. 
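
Diamond's example can be written compactly. The notation below is an assumption introduced here for illustration and does not come from the source: WTP(n) stands for willingness to pay to save n birds, and WTP(k | X - k) for willingness to pay to save k further birds given that X - k birds have already been protected at no cost.

```latex
\[
  \mathrm{WTP}(X) \;=\; \mathrm{WTP}(X - k) \;+\; \mathrm{WTP}(k \mid X - k)
\]
```

Read this way, a response pattern in which WTP(X) and WTP(X - k) are equal would imply that the conditional term is zero, that is, that the additional k birds have no value to the respondent.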

Strict adherence to Bayes’ rule may be 
an excessively demanding standard for intu- 
itive predictions; similarly, it would be too 
much to ask for WTP responses that strictly 
conform to the “add-up rule.” In both cases, 
however, it seems reasonable to expect some 
sensitivity to extension — to the base rate 
of outcomes in categorical prediction and to 
the scope of the good in WTP. In fact, several 
studies have documented nearly complete 
neglect of scope in CV surveys. The best- 
known demonstration of scope neglect is an 
experiment by Desvousges et al. (1993), who 
used the scenario of migratory birds that 
drown in oil ponds. The number of birds said 
to die each year was varied across groups. 
The WTP responses were completely insen- 
sitive to this variable; the mean WTPs for 
saving 2000, 20,000, or 200,000 birds were 
$80, $78, and $88, respectively. 

A straightforward interpretation of this 
result involves the two acts of substitution 
that characterize prototype heuristics. The 
deaths of numerous birds are first repre- 
sented by a prototypical instance — perhaps 
an image of a bird soaked in oil and drown- 
ing. The prototype automatically evokes 
an affective response, and the intensity of 
that emotion is then mapped onto the dollar 
scale, substituting the readily accessible 
heuristic attribute of affective intensity 
for the target attribute of 
economic value. Other examples of radical 
insensitivity to scope lend themselves to a 
similar interpretation. Among others, Kah- 
neman (1986) found that Toronto residents 
were willing to pay almost as much to clean 
up polluted lakes in a small region of On- 
tario as to clean up all the polluted lakes in 
Ontario, and McFadden and Leonard (1993) 
reported that residents in four western states 
were willing to pay only 28% more to protect 
57 wilderness areas than to protect a single 
area (for more discussion of scope insensitiv- 
ity, see Frederick & Fischhoff, 1998). 

The similarity between WTP statements 
and categorical predictions is not limited 
to such demonstrations of almost complete 
extension neglect. The two responses also 
yield similar results when extension and 
prototype information are varied factori- 
ally within subjects. Figure 12.2(a) shows 
the results of a study of WTP for pro- 
grams that prevented different levels of 
damage to species of varying popularity 
(Ritov & Kahneman, unpublished observa- 
tions, cited in Kahneman, Ritov, & Schkade, 
1999). As in the case of base rate [Figure 
12.2(c)], extensional information (levels of 
damage) combines additively with nonex- 
tensional information. This rule of combina- 
tion is unreasonable; in any plausible theory 
of value, the lines would fan out. 

Finally, the role of the emotion evoked 
by a prototypical instance was also exam- 
ined directly in the same experiment, us- 
ing the heuristic elicitation paradigm intro- 
duced earlier: Some respondents were asked 
to imagine that they saw a television pro- 
gram documenting the effect of adverse eco- 
logical circumstances on individual mem- 
bers of different species. The respondents 
indicated, for each species, how much con- 
cern they expected to feel while watching 
such a documentary. The correlation be- 
tween this measure of affect and willingness 
to pay, computed across species, was .97. 


Duration Neglect in the Evaluation 
of Experiences 


We next discuss experimental studies of the 
global evaluation of experiences that extend 
over some time, such as a pleasant or a 
horrific film clip (Fredrickson & Kahneman, 
1993), a prolonged unpleasant noise 
(Schreiber & Kahneman, 2000), pressure 
from a vise (Ariely, 1998), or a painful med- 
ical procedure (Redelmeier & Kahneman, 
1996). Participants in these studies provided 
a continuous or intermittent report of hedo- 
nic or affective state, using a designated scale 
of momentary affect (Figure 12.3). When 
the episode had ended, they indicated a 
global evaluation of “the total pain or dis- 
comfort” associated with the entire episode. 

We first examine the normative rules that 
apply to this task. The global evaluation of 
a temporally extended outcome is an exten- 
sional attribute, which is governed by a dis- 
tinctive logic. The most obvious rule is tem- 
poral monotonicity: There is a compelling 
intuition that adding an extra period of pain 
to an episode of discomfort can only make 
it worse overall. Thus, there are two ways 
of making a bad episode worse — making 
the discomfort more intense or prolonging 
it. It must therefore be possible to trade off 
intensity against duration. Formal analyses 
have identified conditions under which the 
total utility of an episode is equal to the 
temporal integral of a suitably transformed 
measure of the instantaneous utility associ- 
ated with each moment (Kahneman, 2000; 
Kahneman, Wakker, & Sarin, 1997). 
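
One way to write the integration rule just described, as a sketch under assumed notation: u(t) is the instantaneous utility at time t, φ is the transformation applied to it, T is the duration of the episode, and U is the total utility. None of these symbols appears in the source.

```latex
\[
  U \;=\; \int_{0}^{T} \phi\bigl(u(t)\bigr)\, dt
\]
```

Under this form, if φ(u(t)) is negative throughout an added period of pain, extending T can only lower U, which is the temporal monotonicity described above.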

Next, we turn to the psychology. 
Fredrickson and Kahneman (1993) proposed 
a “snapshot model” for the retrospective 
evaluation of episodes, which again involves 
two acts of substitution: First, the episode is 
represented by a prototypical moment; next, 
the affective value attached to the represen- 
tative moment is substituted for the exten- 
sional target attribute of global evaluation. 
The snapshot model was tested in an exper- 
iment in which participants provided con- 
tinuous ratings of their affect while watch- 
ing plotless films that varied in duration and 
affective value (e.g., fish swimming in coral 
reefs, pigs being beaten to death with clubs), 
and later reported global evaluations of their 
experiences. The central finding was that the 
retrospective evaluations of these observers 
were predicted with substantial accuracy by 
a simple average of the peak affect recorded 
during a film and the end affect reported as 
the film was about to end. This has been 
called the peak/end rule. However, the cor- 
relation between retrospective evaluations 
and the duration of the films was negligible — 
a finding that Fredrickson and Kahneman la- 
beled duration neglect. The resemblance of 
duration neglect to the neglect of scope and 
base rate is striking and unlikely to be ac- 
cidental. In this analysis, all three are mani- 
festations of extension neglect caused by the 
use of a prototype heuristic. 

The peak/end rule and duration neglect 
have both been confirmed on multiple oc- 
casions. Figure 12.3 presents raw data from 
a study reported by Redelmeier and Kahne- 
man (1996), in which patients undergoing 
colonoscopy reported their current level of 
pain every 60 seconds throughout the proce- 
dure. Here again, an average of peak and end 
pain quite accurately predicted subsequent 
global evaluations and choices. The duration 
of the procedure varied considerably among 
patients (from 4 to 69 minutes), but these 
differences were not reflected in subsequent 
global evaluations in accord with duration 
neglect. The implications of these psycho- 
logical rules of evaluation are paradoxical. In 
Figure 12.3, for example, it appears evident 
that patient B had a worse colonoscopy than 
patient A (on the assumption they used the 
scale similarly). However, it is also appar- 
ent that the peak/end average was worse 
for patient A, whose procedure ended at 
a moment of relatively intense pain. The 
peak/end rule prediction for these two pro- 
files is that patient A would evaluate the 
procedure more negatively than patient B 
and would be more likely to prefer to un- 
dergo a barium enema rather than a repeat 
colonoscopy. The prediction was correct for 
these two individuals and confirmed by the 
data of a large group of patients. 
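
The paradox can be made concrete with a small calculation. The sketch below is illustrative only: the two rating series are hypothetical and are not the data plotted in Figure 12.3, the function names are assumptions, and a plain sum of momentary ratings is used as a simple stand-in for a temporal integral.

```python
def peak_end_average(ratings):
    """Average of the worst momentary rating and the final rating."""
    return (max(ratings) + ratings[-1]) / 2.0


def total_discomfort(ratings):
    """Sum of momentary ratings, a crude stand-in for the temporal integral."""
    return sum(ratings)


# Hypothetical pain ratings on a 0-10 scale, one per minute.
patient_a = [2, 4, 7, 8, 7]                        # short, ends at high pain
patient_b = [2, 4, 7, 8, 6, 5, 4, 4, 3, 2, 1, 1]   # longer, ends at low pain

for name, ratings in (("A", patient_a), ("B", patient_b)):
    print(
        f"patient {name}: {len(ratings)} minutes, "
        f"total {total_discomfort(ratings)}, "
        f"peak/end {peak_end_average(ratings)}"
    )
```

In this made-up case the longer procedure accumulates more total pain, yet its peak/end average is lower because it ends on a mild moment, so the peak/end rule would predict a more favorable retrospective evaluation for it.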

The effects of substantial variations of du- 
ration remained small (although statistically 
robust) even in studies conducted in a fac- 
torial design. Figure 12.2(d) is drawn from a 
study of responses to ischemic pain (Ariely, 
1998), in which duration varied by a factor of 
4. The peak/end average accounted for 98% 
of the systematic variance of global evaluations 
in that study and for 88% of the variance 
in a similar factorial study of responses 
to loud unpleasant sounds [Schreiber & Kahneman, 
2000, Figure 12.2(b)]. Contrary to 
the normative standard for an extensional attribute, 
the effects of duration and of other 
determinants of evaluation were additive 
[Figures 12.2(b) and 12.2(d)]. 


Figure 12.3. Pain intensity reported by two 
colonoscopy patients. 

The participants in these studies were 
well aware of the relative duration of their 
experiences and did not consciously de- 
cide to ignore duration in their evalua- 
tions. As Fredrickson and Kahneman (1993) 
noted, duration neglect is an attentional 
phenomenon: 


... duration neglect does not imply 
that duration information is lost, nor 
that people believe that duration is 
unimportant. ... people may be aware of 
duration and consider it important in the 
abstract [but] what comes most readily to 
mind in evaluating episodes are the salient 
moments of those episodes and the affect 
associated with those moments. Duration 
neglect might be overcome, we suppose, by 
drawing attention more explicitly to the 
attribute of time. (p. 54) 


This comment applies equally well to 
other instances of extension neglect: The ne- 
glect of base rate in categorical prediction, 
the neglect of scope in willingness to pay, the 
neglect of sample size in evaluations of ev- 
idence (Griffin & Tversky, 1992; Tversky & 
Kahneman, 1971), and the neglect of prob- 
ability of success in evaluating a program of 
species preservation (DeKay & McClelland, 
1995). More generally, inattention plays a 
similar role in any situation in which the in- 
tuitive judgments generated by system 1 vio- 
late rules that would be accepted as valid by 
the more deliberate reasoning that we asso- 
ciate with system 2. As we noted earlier, the 
responsibility for these judgmental mishaps 
is properly shared by the two systems: Sys- 
tem 1 produces the initial error, and system 
2 fails to correct it, although it could. 


Violations of Dominance 


The conjunction fallacy observed in the 
Linda problem is an example of a domi- 
nance violation in judgment: Linda must be 
at least as likely to be a bank teller as to 
be a feminist bank teller, but people be- 
lieve the opposite. Insensitivity to extension 
(in this case, base rate) effectively guaran- 
tees the existence of such dominance viola- 
tions. For another illustration, consider the 
question: “How many murders were there 
last year in [Detroit/Michigan]?” Although 
there cannot be more murders in Detroit 
than in Michigan, because Michigan con- 
tains Detroit, the word “Detroit” evokes a 
more violent image than the word “Michi- 
gan” (except of course for people who im- 
mediately think of Detroit when Michigan 
is mentioned). If people use an impres- 
sion of violence as a heuristic and neglect 
geographic extension, their estimates of 
murders in the city may exceed their estimates 
for the state. In a large sample of University 
of Arizona students, this hypothesis 
was confirmed — the median estimate of the 
number of murders was 200 for Detroit and 
100 for Michigan. 
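
Both examples rest on the same set-inclusion constraint, which can be stated as a pair of inequalities. The symbols are assumptions introduced here for illustration: P denotes probability and M the number of murders in the named region.

```latex
\[
  P(\text{bank teller} \wedge \text{feminist}) \;\le\; P(\text{bank teller}),
  \qquad
  M_{\text{Detroit}} \;\le\; M_{\text{Michigan}}
\]
```

The median estimates reported above (200 for Detroit, 100 for Michigan) reverse the second inequality.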

Violations of dominance akin to the con- 
junction fallacy have been observed in sev- 
eral other experiments involving both indi- 
rect (between-subjects) and direct tests. In a 
clinical experiment reported by Redelmeier, 
Katz, and Kahneman (2001), half of a large 
group of patients (N = 682) undergoing a 
colonoscopy were randomly assigned to a 
condition that made the actual experience 
strictly worse. Unbeknownst to the patient, 
the physician deliberately delayed the re- 
moval of the colonoscope for approximately 
1 minute beyond the normal time. The in- 
strument was not moved during the extra pe- 
riod. For many patients, the mild discomfort 
of the added period was an improvement 
relative to the pain that they had just experienced. 
For these patients, of course, prolonging 
the procedure reduced the peak/end 
average of discomfort. As expected, retro- 
spective evaluations were less negative in 
the experimental group, and a 5-year follow- 
up showed that participants in that group 
were also somewhat more likely to comply 
with recommendations to undergo a repeat 
colonoscopy (Redelmeier, Katz, & Kahne- 
man, 2001). 

In an experiment that is directly analo- 
gous to the demonstrations of the conjunc- 
tion fallacy, Kahneman et al. (1993) exposed 
participants to two cold-pressor experiences, 
one with each hand: a “short” episode (im- 
mersion of one hand in 14°C water for 
60 seconds), and a “long” episode (the short 
episode, plus an additional 30 seconds during 
which the water was gradually warmed to 
15 °C). The participants indicated the inten- 
sity of their pain throughout the experience. 
When they were later asked which of the 
two experiences they preferred to repeat, 
a substantial majority chose the long trial. 
These choices violate dominance, because 
after 60 seconds in cold water anyone will 
prefer the immediate experience of a warm 
towel to 30 extra seconds of slowly
diminishing pain. In a replication, Schreiber and
Kahneman (2000, experiment 2) exposed
participants to pairs of unpleasant noises in 
immediate succession. The participants lis- 
tened to both sounds and chose one to be re- 
peated at the end of the session. The “short” 
noise lasted 8 seconds at 77 dB. The “long”
noise consisted of the short noise plus an
extra period (of up to 24 seconds) at 66 dB
(less aversive, but still unpleasant and cer- 
tainly worse than silence). Here again, the 
longer noise was preferred most of the time, 
and this unlikely preference persisted over a 
series of five choices. 
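
The logic of these dominance violations can be made concrete with a small
sketch. It assumes, purely for illustration, that retrospective evaluation
tracks the peak/end average mentioned above and that "choosing by liking"
picks the episode with the lower remembered discomfort. The discomfort
ratings and the function names are hypothetical and are not data from the
studies cited.

```python
# Illustrative sketch only: the ratings below are invented, not study data.

def total_discomfort(ratings):
    """Sum of moment-by-moment discomfort (the extensional quantity)."""
    return sum(ratings)

def peak_end_average(ratings):
    """Average of the worst moment and the final moment of an episode."""
    return (max(ratings) + ratings[-1]) / 2

def choose_by_liking(episode_a, episode_b):
    """Pick the episode whose remembered (peak/end) discomfort is lower."""
    if peak_end_average(episode_a) <= peak_end_average(episode_b):
        return episode_a
    return episode_b

# Hypothetical ratings on a 0-10 scale, one rating per 10 seconds.
short_episode = [8, 8, 8, 8, 8, 8]            # 60 s of cold water
long_episode = short_episode + [7, 6, 5]      # same 60 s plus 30 s slightly warmer

print("total discomfort:", total_discomfort(short_episode), total_discomfort(long_episode))
print("peak/end average:", peak_end_average(short_episode), peak_end_average(long_episode))
print("chosen for repetition:",
      "long" if choose_by_liking(short_episode, long_episode) is long_episode else "short")
```

Under these assumed ratings the long episode contains strictly more total
discomfort, yet its peak/end average is lower, so a chooser who consults only
the retrospective evaluation selects the dominated option.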

The violations of dominance in these di- 
rect tests are particularly surprising because 
the situation is completely transparent. The 
participants in the experiments could eas- 
ily retrieve the durations of the two experi- 
ences between which they had to choose, 
but the results suggest that they simply 
ignored duration. A simple explanation is 
that the results reflect “choosing by liking” 
(see Frederick, 2002). The participants in 
the experiments simply followed the nor- 
mal strategy of choice: “When choosing be- 
tween two familiar options, consult your ret- 
rospective evaluations and choose the one 
that you like most (or dislike least).” Lik- 
ing and disliking are products of system 1, 
which do not conform to the rules of ex- 
tensional logic. System 2 could have inter- 
vened, but in these experiments it generally 
did not. Kahneman et al. (1993) described a 
participant in their study, who chose to re- 
peat the long cold-pressor experience. Soon 
after the choice was recorded, the partic- 
ipant was asked which of the two expe- 
riences was longer. As he correctly identi- 
fied the long trial, the participant was heard 
to mutter “the choice I made doesn’t seem 
to make much sense.” Choosing by liking 
is a form of mindlessness (Langer, 1978), 
which illustrates the casual governance of 
system 2. 

Like the conjunction fallacy in direct 
tests, which we discussed earlier, violations 
of temporal monotonicity in choices should 
be viewed as an expendable flourish. Be- 
cause the two aversive experiences occurred 
within a few minutes of each other and
respondents could recall the duration of
the two events, system 2 had enough
information to override choosing by liking. 
Its failure to do so is analogous to the fail- 
ures observed in direct tests of the Linda 
problem. In both cases, the violations of 
dominance tell us nothing new about sys- 
tem 1; they only illustrate an unexpected 
weakness of system 2. Just as the theory of 
intuitive categorical prediction would have 
remained intact if the conjunction fallacy 
had not “worked” in a direct test, the model 
of evaluation by moments would have sur- 
vived even if violations of dominance had 
been eliminated in highly transparent situa- 
tions. The same methodological issues arise 
in both contexts. Between-subjects experi- 
ments or subtle tests are most appropriate 
for studying the basic intuitive evaluations of 
system 1, and also most likely to reveal com- 
plete extension neglect. Factorial designs in 
which extension is manipulated practically 
guarantee an effect of this variable, and al- 
most guarantee that it will be additive, as 
in Figures 12.2(b) and 12.2(d) (Ariely, 1998; 
Ariely, Kahneman, & Loewenstein, 2000; 
Schreiber & Kahneman, 2000). Finally, al- 
though direct choices sometimes yield sys- 
tematic violations of dominance, these vio- 
lations can be avoided by manipulations that 
prompt system 2 to take control. 

In our view, the similarity of the re- 
sults obtained in diverse contexts is a com- 
pelling argument for a unified interpreta- 
tion, and a significant challenge to critiques 
that pertain only to selected subsets of this 
body of evidence. A number of commenta- 
tors have offered competing interpretations 
of base rate neglect (Cosmides & Tooby, 
1996; Koehler, 1996), insensitivity to scope 
in WTP (Kopp, 1992), and duration ne- 
glect (Ariely & Loewenstein, 2000). How- 
ever, these interpretations are generally spe- 
cific to a particular task and would not carry 
over to analogous findings in other domains. 
Similarly, the various attempts to explain the 
conjunction fallacy as an artifact do not ex- 
plain analogous violations of dominance in 
the cold-pressor experiment. The account 
we have offered is, in contrast, equally
applicable to all three contexts and possibly
others (see also Kahneman, Ritov, &
Schkade, 1999). We attribute extension
neglect and violations of dominance to a lazy
system 2, and to a prototype heuristic that 
combines two processes of system 1: the rep- 
resentation of categories by prototypes and 
the substitution of a nonextensional heuris- 
tic attribute for an extensional target at- 
tribute. We also propose that people have 
some appreciation of the role of extension 
in the various judgment tasks. Consequently, 
they will incorporate extension in their judg- 
ments when their attention is drawn to this 
factor — most reliably in factorial experi- 
ments, and sometimes (although not always) 
in direct tests. The challenge for compet- 
ing interpretations is to provide a unified ac- 
count of the diverse phenomena that have 
been considered in this section. 


Conclusions and Future Directions 


The original goal of the heuristics and biases 
program was to understand intuitive judg- 
ment under uncertainty. Heuristics were de- 
scribed as a collection of disparate cognitive 
procedures, related only by their common 
function in a particular judgmental domain — 
choice under uncertainty. It now appears, 
however, that judgment heuristics are ap- 
plied in a wide variety of domains and share 
a common process of attribute substitution, 
in which difficult judgments are made by 
substituting conceptually or semantically re- 
lated assessments that are simpler and more 
readily accessible. 

The current treatment explicitly ad- 
dresses the conditions under which intu- 
itive judgments are modified or overridden. 
Although attribute substitution provides an 
initial input into many judgments, it need 
not be the sole basis for them. Initial impres- 
sions are often supplemented, moderated, or 
overridden by other considerations, includ- 
ing the recognition of relevant logical rules 
and the deliberate execution of learned al- 
gorithms. The role of these supplemental or 
alternative inputs depends on characteristics 
of the judge and the judgment task. 
does not entail a belief that every mental
operation (including each postulated heuris- 
tic) can be definitively assigned to one sys- 
tem or the other. The placement of di- 
viding lines between “systems” is arbitrary 
because the bases by which we characterize 
mental operations (difficulty of acquisition, 
accessibility to introspection, and disrupt- 
ability) are all continua. However, this does 
not make distinctions less meaningful; there 
is broad agreement that mental operations 
range from rapid, automatic, perception-like 
impressions to deliberate computations that 
apply explicit rules or external aids. 

Many have questioned the usefulness 
of the notion of heuristics and biases by 
pointing to inconsistencies in the degree to 
which illusions are manifested across differ- 
ent studies. However, there is no mystery 
here to explain. Experimental studies of “the 
same” cognitive illusions can yield different 
results for two reasons: (1) because of vari- 
ation in factors that determine the accessi- 
bility of the intuitive illusion, and (2) be- 
cause they vary in factors that determine the 
accessibility of the corrective thoughts that 
are associated with system 2. Both types of 
variation can often be anticipated because 
of the vast amount of psychological knowl- 
edge that has accumulated about the differ- 
ent sets of factors that determine the ease 
with which thoughts come to mind — from 
principles of grouping in perception to prin- 
ciples that govern transfer of training in rule 
learning (Kahneman, 2003). Experimental 
surprises will occur, of course, and should 
lead to refinements in the understanding of 
the rules of accessibility. 

The argument that system 1 will be ex- 
pressed unless it is overridden by system 2 
sounds circular, but it is not, because empir- 
ical criteria can be used to test whether a par- 
ticular characterization of the two systems is 
accurate. For example, a feature of the situ- 
ation will be associated with system 2 if it is 
shown to influence judgments only when at- 
tention is explicitly directed to it (through, 
say, a within-subjects design). In contrast, a 
variable will be associated with system 1 if it 
can be shown to influence even those judg- 


one need not be committed, a priori, to as- 
signing a process to a particular system; the 
data will dictate the best characterization. 

The two-system model is a framework 
that combines a set of empirical generaliza- 
tions about cognitive operations with a set 
of tests for diagnosing the types of cognitive 
operations that underlie judgments in spe- 
cific situations. The generalizations and the 
specific predictions are testable and can be 
recognized as true or false. The framework 
itself will be judged by its usefulness as a 
heuristic for research. 


Acknowledgments 


This chapter is a modified version of a 
chapter by Kahneman and Frederick (2002). 
Preparation of this chapter was supported by 
grant SES-0213481 from the National Sci- 
ence Foundation. 


Note 


1. The entries plotted in Figure 12.1 are averages 
of multiple judgments, and the correlations are 
computed over a set of judgment objects. It 
should be noted that correlations between av- 
erages are generally much higher than corre- 
sponding correlations within the data of indi- 
vidual respondents (Nickerson, 1995). Indeed, 
group results may even be unrepresentative if 
they are dominated by a few individuals who 
produce more variance than others and have 
an atypical pattern of responses. Fortunately, 
this particular hypothesis is not applicable to 
the experiments of Figure 12.1, in which all re- 
sponses were ranks. 
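
The point about correlations between averages can be illustrated with a
small simulation. This is a sketch under stated assumptions, not an analysis
of the data in Figure 12.1: it assumes each respondent's two judgments of an
object equal a shared underlying value plus independent noise, and all
parameter values and names are hypothetical.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate(n_respondents=50, n_objects=20, noise_sd=10.0, seed=0):
    """Return (mean within-respondent correlation, correlation of averages)."""
    rng = random.Random(seed)
    true_values = list(range(n_objects))  # assumed shared signal per object
    judgments_a, judgments_b = [], []
    for _ in range(n_respondents):
        # Two noisy judgments (e.g., two attributes) of every object.
        judgments_a.append([t + rng.gauss(0, noise_sd) for t in true_values])
        judgments_b.append([t + rng.gauss(0, noise_sd) for t in true_values])
    individual = [pearson(a, b) for a, b in zip(judgments_a, judgments_b)]
    mean_individual = sum(individual) / n_respondents
    # Average over respondents, object by object, then correlate the averages.
    avg_a = [sum(col) / n_respondents for col in zip(*judgments_a)]
    avg_b = [sum(col) / n_respondents for col in zip(*judgments_b)]
    return mean_individual, pearson(avg_a, avg_b)

mean_individual_r, r_of_averages = simulate()
print("mean within-respondent r:", round(mean_individual_r, 2))
print("r between averages:      ", round(r_of_averages, 2))
```

Averaging cancels much of the respondent-level noise, so in this setup the
correlation computed over averages is far higher than the typical
correlation within any one respondent's data.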
CHAPTER 13


Motivated Thinking 


Daniel C. Molden 
E. Tory Higgins 


At one time or another, every one of us has 
engaged in “wishful thinking,” or “let our 
hearts influence our heads.” That is, every 
one of us has felt the effects of our motiva- 
tions on our thought processes. Given this 
common everyday experience, it is not sur- 
prising that an essential part of early psy- 
chological research was the idea that drives, 
needs, desires, motives, and goals can pro- 
foundly influence judgment and reasoning. 
More surprising is that motivational vari- 
ables play only a small role in current the- 
ories of reasoning. Why might this be? 

One possible explanation is that since the 
cognitive revolution in the 1960s and 1970s, 
researchers studying motivational and cogni- 
tive processes have been speaking somewhat 
different languages. That is, there has been 
a general failure to connect traditional moti- 
vational concepts, such as drives or motives, 
to information processing concepts, such as 
expectancies or spreading activation, which 
form the foundation for nearly all contem- 
porary research on thinking and reasoning. 
For a time, this led not only to
misunderstanding but also to conflict between
motivational and cognitive perspectives on
judgment. More recently, however, there has
been a sharp increase in attempts to achieve 
a marriage between these two viewpoints in 
a wide variety of research areas. The primary 
objective of this chapter is to review these 
attempts and to demonstrate how it is not 
only possible, but also desirable, to reintro- 
duce motivational approaches to the study 
of basic thought processes. We begin by pro- 
viding some historical background on such 
approaches. 


A Brief History of Motivated Thinking 


Motivational perspectives on thought and 
reasoning originated most prominently with 
Freud’s (1905) clinical theorizing on the 
psychodynamic conflicts created by uncon- 
scious drives and urges. These perspectives 
quickly spread to other areas of psychology. 
Early pioneers of experimental social psy- 
chology gave primary emphasis to motiva- 
tional variables such as drives, goals, and as- 
pirations (e.g., Allport, 1920; Lewin, 1935). 
The study of personality came to involve the 
types of needs and motives (e.g., Murray,
1938). Even research on sensory and percep- 
tual processes was influenced by a motiva- 
tional approach with the emergence of the 
“New Look” school (e.g., McGinnies, 1949). 

After this early period of growth and 
expansion, however, research and theory 
on motivated thinking became quite con- 
troversial. With the ascendance of cogni- 
tive perspectives on thinking and reason- 
ing in the 1960s and 1970s, many supposed 
instances of motivated reasoning were re- 
cast as merely a product of imperfect infor- 
mation processing by imperfect perceivers 
(compare Bruner, 1957, with McGinnies, 
1949; Festinger, 1957, with Bem, 1967; 
Bradley, 1978, with Nisbett & Ross, 1980). 
The various “motivation versus cognition” 
debates that subsequently developed contin- 
ued off and on for years before they were 
declared not only unwinnable but also coun- 
terproductive. An uneasy armistice was de- 
clared (Tetlock & Levi, 1982) that effec- 
tively quieted the public conflict but did 
nothing to reconcile the deep conceptual 
differences that still remained between re- 
searchers favoring cognitive or motivational 
perspectives. 

Following this period of conflict, enthu- 
siasm for questions concerning motivational 
influences on thinking was dampened in the 
1970s and early 1980s. Beginning in the late 
1980s, however, there was a resurgence of 
interest in this area (for recent reviews and 
overviews, see Dunning, 1999; Gollwitzer
& Bargh, 1996; Higgins & Molden, 2003; 
Kruglanski, 1996; Kunda, 1990; Sorrentino 
& Higgins, 1986). One reason for this new 
life is that instead of revisiting debates about 
the workings of motivational versus cog- 
nitive processes, researchers began to ex- 
amine the important interactions between 
these two processes. Thus, more recent 
investigations have focused on the iden- 
tification of principles that describe the 
interface between motivation and cogni- 
tion, and the implications of this interface 
for thinking, reasoning, and judgment (see 
Kruglanski, 1996; Kunda, 1990; Higgins & 
Molden, 2003). 


This chapter reviews this “second generation”
of research on motivated thinking and
discusses some of the
larger principles that have emerged from the 
study of the motivation/cognition interface. 
We consider two general classes of
motivational influences: the first involves people’s
desires for reaching certain types of outcomes 
in their judgments, and the second involves 
people’s desires to use certain types of strate- 
gies while forming their judgments. In so do- 
ing, we adopt a rather broad focus and dis- 
cuss several different varieties of motivated 
thinking. Given space constraints, this broad 
focus necessitates being selective in the phe- 
nomena to be described. We have chosen 
those programs of research that we believe 
are representative of the larger literature and 
are especially relevant not only to the study 
of reasoning but also to other areas in
cognitive psychology. After reviewing the
separate influences on thinking of outcome-
and strategy-based motivations, we conclude
by suggesting potential directions for future 
research, giving special attention to circum- 
stances in which multiple sources of motiva- 
tion might operate simultaneously. 


Outcome-Motivated Thinking 


The most prominent approach to motivated 
reasoning, in both classic and contemporary 
perspectives, has been to examine the influ- 
ence on people’s thought processes of their 
needs, preferences, and goals to reach desired 
outcomes (or avoid undesired outcomes). 
Although the types of preferred outcomes 
that have been studied are highly diverse, 
they can be divided into two general classes: 
directional outcomes and nondirectional out- 
comes (see Kruglanski, 1996; Kunda, 1990). 
Individuals who are motivated by directional 
outcomes are interested in reaching specific 
desired conclusions, such as impressions of 
themselves as intelligent, caring, and worthy 
people (e.g., Dunning, 1999; Pyszczynski & 
Greenberg, 1987), or positive beliefs about 
others whom they find likeable or to whom 
they are especially close (e.g., Murray, 1999). 
Individuals who are motivated by
nondirectional outcomes have more general
concerns, such as reaching the most
accurate conclusion possible (e.g., Fiske & 
Neuberg, 1990) or making a clear and con- 
cise decision (e.g., Kruglanski & Webster, 
1996), whatever this conclusion or decision 
may be. 

Whether outcome motivation is direc- 
tional or nondirectional, however, this moti- 
vation has been conceptualized as affecting 
thought and reasoning in the same way: by 
directing people’s cognitive processes (e.g., 
their recall, information search, or attribu- 
tions) in ways that help to ensure they reach 
their desired conclusions. That is, individu- 
als’ preferences for certain outcomes are be- 
lieved to often shape their thinking so as to 
all but guarantee that they find a way to be- 
lieve, decide, and justify whatever they like. 
In this section, we review several programs 
of research that have more closely examined 
the specific mechanisms by which this can 
occur — first in relation to motivations for 
directional outcomes and then in relation 
to motivations for nondirectional outcomes. 
Following this, we discuss several limitations 
of the effects of outcome motivation on rea- 
soning and identify circumstances in which 
these motivations are most likely to have 
an impact. 


Influences of Directional Outcome Motivation


Overall, the kinds of phenomena that have 
been studied most extensively in research on 
motivated thinking involve directional out- 
come preferences (i.e., individuals’ desires to 
reach specific conclusions about themselves 
and others; for reviews, see Dunning, 1999; 
Kunda, 1990; Murray, 1999; Pyszczynski & 
Greenberg, 1987). Although a variety of out- 
comes have been investigated, people’s well- 
documented preference for viewing them- 
selves, and those close to them, in a generally 
positive manner (see Baumeister, 1998) has, 
by far, received the most attention. This
outcome is the primary focus here. In the next
sections, we review several effects of desires 
for positive self-evaluation involving many 


bution, evaluation of evidence, information 
search, recall and knowledge activation, and 
the organization of concepts in memory. 


EFFECTS ON ATTRIBUTION 


Some of the first evidence for the effects 
on reasoning of motivations for positive self- 
evaluation grew out of work on attribution 
(see Kelley, 1973). Early attributional re- 
search found that when people were ex- 
plaining their performance on tasks measur- 
ing important abilities they tended to take 
responsibility for their success (i.e., cite in- 
ternal and stable causes, such as “I’m good 
at this task”) and to deny responsibility for 
their failure (i.e., cite external and unstable
causes, such as “I was unlucky”). Such find- 
ings were typically described as stemming 
from desires for positive beliefs about the 
self (for a review, see Bradley, 1978). 

The motivational nature of these find- 
ings was questioned, however. Several re- 
searchers (e.g., Nisbett & Ross, 1980) ar- 
gued that although one’s attributions may 
sometimes be biased, this does not neces- 
sarily imply that motivational forces are at 
work (e.g., previous expectancies for success 
could lead people to label an unexpected 
failure as unusual or unlucky). Yet, subse- 
quent research has found that, although peo- 
ple’s expectancies do play a role in these 
attributional effects, there is substantial ev- 
idence that motivation plays an important 
role as well (see Kunda, 1990; Pyszczynski 
& Greenberg, 1987). 

One type of evidence for the role of 
motivation in self-serving attributions is 
that, independent of expectancies from prior 
success or failure, the more personally im- 
portant a success is in any given situation, 
the stronger is the tendency to claim respon- 
sibility for this success but to deny responsi- 
bility for failure (Miller, 1976). Another type 
of evidence is that people’s attributions be- 
come increasingly self-serving when success 
or failure feedback is experienced as highly 
arousing. For instance, Gollwitzer, Earle, and 
Stephan (1982) had participants first com- 
plete an intelligence test, then vigorously 
ride a bicycle while the test was
being scored (increasing their arousal), and
finally, receive feedback about succeeding 
or failing on the test. Feedback was given 
1 minute, 5 minutes, or 9 minutes after riding 
the bicycle. Both those receiving feedback 
after 9 minutes, who were no longer aroused, 
and those receiving feedback after 1 minute, 
who were aroused but still associated this 
arousal with the exercise, showed only 
small attributional differences following suc- 
cess versus failure feedback. In contrast, 
those receiving feedback after 5 minutes, 
who were still aroused but no longer asso- 
ciated this with the exercise, misattributed 
their arousal to the feedback concerning 
the test and showed a strong tendency to 
credit their ability for success and blame 
bad luck for failure (see also Stephan & 
Gollwitzer, 1981). 


EFFECTS ON EVIDENCE EVALUATION 


Similar to these attribution effects, more re- 
cent research has found that motivations for 
positive self-evaluations also influence the 
way in which people evaluate information 
that either supports or contradicts these pos- 
itive self-evaluations. In general, individuals 
tend to (1) give more credence to, and be 
more optimistic about, the validity of infor- 
mation that supports or confirms their stand- 
ing as kind, competent, and healthy people; 
and (2) be more skeptical and cautious about 
information that threatens this standing. 
An example of the first type of influence 
can be found in a study by Ditto, Scepansky, 
Munro, Apanovitch, and Lockhart (1998). 
Individuals were “tested” for the presence of 
a fictitious enzyme in the body, TAA, and 
everyone was told that they had tested pos- 
itive. Half of the people were informed that 
this had positive health consequences, and 
half were informed that this had negative 
health consequences. Those who believed 
TAA had negative health consequences were 
largely dismissive of the test when told it 
was slightly unreliable (i.e., had a 10% false- 
positive rate) and judged the result to be 
only somewhat more valid when told the test 
was highly reliable (i.e., had a .05%
false-positive rate). Those who believed TAA
had positive health consequences, however,
judged the test to be highly valid regardless 
of its reliability (see also Doosje, Spears, & 
Koomen, 1995). 

An example of the second type of in- 
fluence can be found in a study by Kunda 
(1987). Participants read a scientific article 
reporting that caffeine consumption was re- 
lated to serious health problems in women. 
Afterward, women (but not men) who were 
heavy caffeine consumers reported that the
article was less convincing than did women
who were light caffeine consumers. In a
follow-up study in which people read a similar
article that revealed caffeine caused only 
mild health problems, there was no rela- 
tion between their evaluation of the ar- 
ticle and their caffeine consumption. Be- 
cause, in both studies, people’s reasoning 
was altered only when there was a signif- 
icant threat to the self, this demonstrates 
the motivational nature of these results (see 
also Beauregard & Dunning, 1998; Ditto 
et al., 1998). 

Similar effects of people’s desire to view 
themselves positively have also been demon- 
strated in domains that do not directly in- 
volve health consequences. For instance, 
people who encounter scientific research 
that appears to support their cherished at- 
titudes describe this research as being bet- 
ter conducted, and its conclusions as being 
more valid, than those who encounter the 
same research but believe it to be in conflict 
with their cherished attitudes (e.g., Lord, 
Ross, & Lepper, 1979). In addition, people 
have been shown to engage in considerable 
counterfactual thinking (i.e., mentally
undoing the present state of affairs by
imagining “if only...”; see Roese, 1997) when
evidence supporting predictions from a pre- 
ferred theory or worldview fails to materi- 
alize. Such counterfactual thinking allows 
them to generate ways in which they were 
almost correct. However, when evidence is 
consistent with their theories, these same 
individuals do not engage in counterfactual 
thinking, which would force them to gener- 
ate ways in which they were almost wrong 
(Tetlock, 1998). 


EFFECTS ON INFORMATION SEARCH


The motivational influences discussed thus
far center on the quality of people’s in- 
formation processing during reasoning (e.g., 
biased attributions, more or less critical 
evaluations). However, desires for positive 
self-evaluations also affect the quantity of 
people’s information processing (Kruglan- 
ski, 1996). Specifically, such desires moti- 
vate decreased processing and quick accep- 
tance of favorable evidence and increased 
processing and hesitant acceptance of unfa- 
vorable evidence. As one example, Ditto and 
colleagues (Ditto & Lopez, 1992; Ditto et al.,
1998) demonstrated that, compared with 
evaluating favorable evidence, when people 
evaluate unfavorable evidence they spend a 
greater amount of time examining this evi- 
dence and spontaneously generate more al- 
ternate hypotheses about why it might be 
unreliable (see also Pyszczynski & Green- 
berg, 1987). Moreover, they have also shown 
that individuals who are prevented from 
putting this extra cognitive effort into the 
examination of unfavorable evidence (e.g., 
participants who are placed under cognitive 
load) return evaluations that are substan- 
tially less critical. 

Additional evidence of increased processing
of information that is inconsistent
with preferred conclusions comes
from Chaiken and colleagues (Giner-Sorolla 
& Chaiken, 1997; Liberman & Chaiken, 
1992). In one experiment, for example, peo- 
ple read scientific reports claiming that there 
was either a strong link or a weak link 
between caffeine consumption and significant
health risks, similar to the Kunda
(1987) studies discussed earlier. As before,
the group of women who were the most 
threatened by this information were the least 
convinced by the reports. In addition, the 
study found that the most threatened group 
of participants also expended the most ef- 
fort to find flaws in the studies described and 
identified the most weaknesses. 


EFFECTS ON RECALL AND KNOWLEDGE ACTIVATION 


In addition to affecting the appraisal and 
encoding of new information, people’s
desires for positive views of themselves (and
certain well-liked others) have also been
found to influence their use of stored knowl- 
edge in memory such as the selective ac- 
tivation of concepts and recall of events 
that support these views. These phenom- 
ena are exemplified in a series of studies by 
Sanitioso, Kunda, and Fong (1990).
Participants in these studies read fictitious articles
revealing that either introverts or extroverts 
tend to have more academic and professional 
success. Following this, individuals who be- 
lieved that introversion was linked to success 
were more likely to recall, and were faster 
to recall, autobiographical instances of intro- 
verted behaviors than extroverted behaviors. 
The opposite pattern of results was found for 
individuals who believed that extroversion 
was linked to success. 

More recent work has demonstrated that, 
in addition to creating selective recall, direc- 
tional outcome motivation can also lead to 
the reconstruction of previous memories. For 
instance, McDonald and Hirt (1997) showed 
people a videotape of a fellow college stu- 
dent who was portrayed as either likeable 
or unlikable. They then provided some ad- 
ditional information about the target, in- 
cluding his midterm scores in several classes. 
Later, when the target’s scores on his final 
exams were revealed, those who found the 
target likeable remembered some of the tar- 
get’s midterm scores as lower than they ac- 
tually were in order to make the final scores 
more consistent with improvement. In con- 
trast, those who found the target unlikable 
remembered some of the midterm scores as 
higher than they actually were in order to 
make the final scores more consistent with 
decline (see also Conway & Ross, 1984). 

Finally, besides influencing explicit recall, 
motivations to reach specific preferred con- 
clusions also influence more implicit pro- 
cesses, such as knowledge activation and 
accessibility. In one demonstration of this 
(Sinclair & Kunda, 1999), individuals
received either positive or negative feedback from
a person who was a member of multiple 
social categories. One of these social cate- 
gories (doctor) was associated with mostly 
positive stereotypes and another (African
American) was associated with mostly
negative stereotypes. Those who had received
positive feedback from the other person 
were faster than baseline to identify doctor- 
related words and slower than baseline to 
identify African American-related words on 
a lexical-decision task. Those who had re- 
ceived negative feedback showed a reverse 
pattern of activation (see also Spencer, Fein, 
Wolfe, Hodgson, & Dunn, 1998; for a re- 
versal of these effects when people are 
motivated by egalitarian rather than self- 
serving outcomes, see Moskowitz, Goll- 
witzer, Wasel, & Schaal, 1999). 


EFFECTS ON ORGANIZATION OF CONCEPTS IN MEMORY


Finally, beyond affecting the activation of 
knowledge from memory, motivation for di- 
rectional outcomes can also influence the 
way in which people come to organize this 
knowledge. The most widely studied exam- 
ple of this concerns how desires for positive 
self-evaluation lead people to form stronger 
associations between their self-concepts and 
attributes that they feel are praiseworthy or 
related to success. Three primary strategies 
by which people accomplish this have been 
identified: (1) altering one’s self-concept 
to include attributes that are believed to 
bring about successful outcomes (e.g., Klein 
& Kunda, 1992; Kunda & Sanitioso, 1989);
(2) coming to view the attributes that one 
already possesses as essential for success- 
ful outcomes (Dunning, Leuenberger, & 
Sherman, 1995; Dunning, Perie, & Story, 
1991; Kunda, 1987); and (3) redefining the 
criteria that must be met before one can 
be considered successful or in possession 
of particular positive and negative qualities 
(Beauregard & Dunning, 1998; Dunning & 
Cohen, 1992; see also Alicke, LoSchiavo, 
Zerbst, & Zhang, 1997). 

The latter two strategies are of particu-
lar relevance to the issue of knowledge or-
ganization. Use of the second strategy can
clearly be seen in a program of research by 
Dunning and his colleagues (Dunning et al., 
1995; Dunning et al., 1991). In one study, 
people who considered themselves either 
more goal-oriented or more people-oriented
rated characteristics consistent with
their own orientation (e.g., determined in the
former case versus dependable in the latter) 
as more prototypical of successful leaders 
(see also Kunda, 1987). In another study, in- 
dividuals rated their own characteristics as 
more prototypical of positive qualities such 
as intelligence but as less prototypical of neg- 
ative qualities such as aloofness. 

Use of the third strategy can be seen 
in another series of experiments by Dun- 
ning and his colleagues (Beauregard & Dun- 
ning, 1998; Dunning & Cohen, 1992; see also 
Alicke et al., 1997). Participants in these ex- 
periments were asked to judge the abilities 
of others in several domains (e.g., math, ath- 
letics). When participants themselves were 
highly skilled in the domain they were con- 
sidering or had just experienced a relevant 
personal success, they set higher perfor- 
mance standards for others. That is, to dis- 
tinguish their own superiority, they judged 
others as less successful. However, when par- 
ticipants themselves were not highly skilled 
in the domain they were considering or had 
just experienced a relevant personal failure, 
they set lower performance standards for 
others. That is, to cast those outperform- 
ing them as relatively high achievers, they 
judged them as more successful. 

In sum, motivations for directional out- 
comes can affect basic cognitive processes 
and influence thinking in several profound 
ways. These types of motivations affect not 
only how people search for, evaluate, and ex- 
plain information in the world around them 
but also how they activate, access, and orga- 
nize their knowledge about themselves and 
others. The next section reviews research in- 
dicating that motivations for nondirectional 
outcomes can be equally important. 


Influences of Nondirectional 
Outcome Motivation 


Although less research exists concerning the 
cognitive effects of nondirectional outcome 
motivation, several varieties have been con- 
sidered in some depth (e.g., Cacioppo, Petty, 
Feinstein, & Jarvis, 1996; Fiske & Neuberg, 
1990; Kruglanski & Webster, 1996; Lerner &
Tetlock, 1999). The most
prominent are desires for accuracy (Fiske &
Neuberg, 1990) and desires for clarity and 
conciseness, or closure (Kruglanski & Web- 
ster, 1996). Here, we consider the effects of 
these two motivations (which, as will be dis- 
cussed, often have opposing effects on in- 
formation processing) on many of the same 
cognitive processes examined in the previ- 
ous section. 

Before beginning, however, it should be 
noted that both accuracy and closure mo- 
tivation have been operationalized in multi- 
ple ways. For example, motivations for accu- 
racy have been studied in terms of wanting to 
know as much as possible about a person on 
whom one is going to be dependent (Neu- 
berg & Fiske, 1987), feelings of accountabil- 
ity for one’s judgments (e.g., Tetlock, 1983), 
a “fear of invalidity” (e.g., Kruglanski & Fre- 
und, 1983), and simple desires to be as cor- 
rect as possible (e.g., Neuberg, 1989). Mo- 
tivations for closure have been examined in 
terms of feelings of time pressure (Kruglan- 
ski & Freund, 1983), a desire to quickly 
complete judgment tasks that are dull and 
unattractive (Webster, 1993), and desires 
to escape noisy environments (Kruglanski, 
Webster, & Klem, 1993; see Kruglanski & 
Webster, 1996). In the initial discussion pre- 
sented, each of these varieties of accuracy or 
closure motivation is treated as equivalent;
some important differences among the ef- 
fects of these various operationalizations are 
considered at the end. 


EFFECTS ON ATTRIBUTION 


In addition to self-serving biases that oc- 
cur when people explain their own perfor- 
mance, as described previously, research on 
attribution has also identified more general 
biases. For example, there is the tendency for 
people to fixate on one particular cause for 
some action or event and then fail to ade- 
quately consider alternative causes that are 
also possible (see Gilbert & Malone, 1995; 
see also Buehner & Cheng, Chap. 7; Kahne- 
man & Frederick, Chap. 12). Although these 
attributional biases have been largely con-
sidered from a purely cognitive standpoint,
there is evidence that they can
also be influenced by accuracy and closure
motivations. 

In one study, Tetlock (1985) had par- 
ticipants read an essay either supporting 
or opposing affirmative action that had os- 
tensibly been written by someone from a 
previous experiment. They were then in- 
formed that the author of the essay had been 
assigned to take this position by the exper-
imenter and asked to judge the extent to
which the arguments presented in the essay 
reflected the author’s own attitude. People 
who were not provided with any additional 
motivations displayed the typical fixation on 
a single cause. These individuals reported 
that the position taken in the supportive es- 
say could be explained by the positive atti- 
tude of the author toward affirmative action, 
whereas the position taken in the oppos- 
ing essay could be explained by the nega- 
tive attitude of the author toward affirma- 
tive action despite knowing that both essays 
had been largely coerced by the experi- 
menter. However, people who were moti- 
vated to make accurate judgments (by in- 
forming them that they would later be 
discussing the reasons for their impressions 
with the experimenter) did consider the al- 
ternative cause represented by the experi- 
menter’s coercion. These individuals judged 
the attitude of the author to be neutral re- 
gardless of which essay they read. A study 
by Webster (1993) using a similar paradigm 
showed that, in contrast, when participants’ 
motivation for closure was increased, the 
typical fixation on a single cause became 
even more pronounced. Thus, a need for ac- 
curacy and a need for closure appear to have 
opposite effects on people’s considerations 
of alternate causes during attribution (see 
Kruglanski & Freund, 1983; Kruglanski & 
Webster, 1996). 


EFFECTS ON EVIDENCE EVALUATION AND 
INFORMATION SEARCH 

As discussed earlier, research on directional 
outcome motivation has demonstrated that 
people engage in increased evidence eval- 
uations and prolonged information search 
when encountering evidence unfavorable
to their preferred self-views and reduced
evidence evaluation and information search
when encountering evidence favorable to 
their preferred self-views. In contrast, ac- 
curacy motivation produces prolonged in- 
formation search, and closure motivation 
produces reduced information search, re- 
gardless of the circumstances. 

This consequence of accuracy motivation 
is evident in a study by Neuberg (1989),
where people were asked to conduct a tele- 
phone interview with a peer but were given 
unfavorable expectations concerning the in- 
terviewee. Those participants who were in- 
structed to “form the most accurate impres- 
sions possible” of the other person spent 
more time listening and provided more op- 
portunities for the interviewee to elaborate 
his or her opinions. This in turn prevented 
their unfavorable expectations from creat- 
ing negative final impressions of the inter- 
viewee, which is what occurred with those 
participants who were not given any special 
instructions for the interview. 

Similar consequences of accuracy moti- 
vation are also seen in research by Chaiken 
and colleagues (for reviews, see Chen & 
Chaiken, 1999; Eagly & Chaiken, 1993). 
For example, in one study by Maheswaran 
and Chaiken (1991), participants evaluated 
a product based on a detailed review that 
described this product more favorably or 
less favorably than similar products. Partic- 
ipants who were high in accuracy motiva- 
tion, because they believed their evaluations 
would have important consequences, gener-
ated more thoughts about the strengths and
weaknesses of the specific product-quality 
arguments that were listed in the review 
than did those who were low in accuracy 
motivation. This again attenuated any ef- 
fects of people’s prior expectations on their 
final evaluations. 

The consequences of closure motiva- 
tion on evidence evaluation and information 
search have been shown in several studies by
Kruglanski et al. (1993). People were paired 
with someone else for a discussion about 
the verdict of a mock trial. Before the dis-
cussion, everyone received a summarized le-
gal analysis of the case which, unbeknownst
to the participants, supported
a different verdict for each member of the
pair. Participants with high (versus low) clo- 
sure motivation attempted to bring about a 
quick end to the discussion. Moreover, when 
asked before the discussion, they expressed a 
strong preference for a partner who could be 
easily persuaded to their existing viewpoint, 
and once the discussion began, they stub- 
bornly attempted to convince their partner 
to see things their way rather than consider- 
ing alternative arguments. 


EFFECTS ON EVALUATION COMPLEXITY 


In addition to affecting the length of peo- 
ple’s analysis and evaluation of evidence, 
nondirectional outcome motivation can also 
influence the complexity of this analysis. 
Accuracy-motivated individuals form judg- 
ments that show greater consideration of 
conflicting opinions and evidence, whereas 
closure-motivated individuals form judg- 
ments that show less of this type of con- 
sideration. Tetlock and colleagues demon- 
strated these effects in experiments in which 
participants were asked to write down their 
thoughts about topics such as affirmative ac- 
tion, American foreign policy, and the causes 
of certain historical events (for a review, see 
Lerner & Tetlock, 1999). Responses were 
then coded for their integrative complexity, 
which was defined in terms of the degree 
to which multiple perspectives on an issue 
were both identified and then integrated into 
a framework that included complex con- 
nections between them. Findings with peo- 
ple who were both novices and experts on 
the issues they were analyzing (i.e., college
students and professional historians, respec- 
tively) indicated that those with increased 
accuracy motivation provided a more in- 
tegratively complex analysis (e.g., Tetlock, 
1983), whereas those with increased clo- 
sure motivation provided a less integratively 
complex analysis (Tetlock, 1998). 


EFFECTS ON RECALL AND KNOWLEDGE ACTIVATION 


Whereas directional outcome motivation 
was seen earlier to have qualitative ef-
fects on recall and knowledge activation,
nondirectional outcome motivation has
largely quantitative effects. Once again, ac-
curacy motivation and closure motivation
have opposite influences. 

In an investigation of accuracy moti- 
vation on recall during impression forma- 
tion, Berscheid and colleagues found that 
when people observed interviews involv- 
ing individuals with whom they might later 
be paired, they paid more attention to 
the interview and remembered more infor- 
mation about the interviewees than when 
they did not expect any future interactions 
(Berscheid, Graziano, Monson, & Dermer, 
1976; see also Srull, Lichtenstein, & Roth- 
bart, 1985). However, in studies of closure 
motivation and impression formation, indi- 
viduals with chronically high (versus low) 
need for closure spent less time reading dif- 
ferent pieces of behavioral information they 
were given about a target and later recalled 
fewer of these behaviors (Dijksterhuis, van 
Knippenberg, Kruglanski, & Schaper, 1996). 
There is also evidence that people with high 
(versus low) accuracy motivation activate 
more pieces of individuating trait and behav- 
ioral information when forming impressions 
of others (Kruglanski & Freund, 1983; Neu- 
berg & Fiske, 1987), whereas people with 
high (versus low) need for closure display 
an increased tendency to rely solely on cat- 
egorical information during impression for- 
mation (Dijksterhuis et al., 1996; Kruglanski 
& Freund, 1983; see also Moskowitz, 1993). 

Similar effects are found for the use 
of highly accessible knowledge structures 
or attitudes in judgment. In typical cir- 
cumstances, concepts or attitudes that have 
been recently or frequently activated will 
lead people to assimilate their judgments 
to this highly accessible information with- 
out considering any additional information 
(see Fazio, 1995; Higgins, 1996). Increased 
accuracy motivation can attenuate assimila- 
tion effects by increasing the activation of al- 
ternative interpretations, whereas increased 
closure motivation can exacerbate assimila- 
tion effects by decreasing the activation of al- 
ternative interpretations. For example, when 
evaluating the behavior of a target person 
who was ambiguously adventurous or reck-
less, people assimilated their judgments to
whichever one of these concepts was most
accessible to a greater extent when their 
closure motivation was high but to a lesser 
extent when their accuracy motivation was 
high (Ford & Kruglanski, 1995; Thompson 
et al., 1994). These effects have been found 
both when people are making online judg- 
ments (Kruglanski & Freund, 1983; Schuette 
& Fazio, 1995) and when they are recon- 
sidering previously encountered information 
(Sanbonmatsu & Fazio, 1990; Thompson 
et al., 1994). 

Overall, then, motivations for nondirec- 
tional outcomes can also affect basic cog- 
nitive processes and profoundly influence 
thinking. Whereas motivations for direc- 
tional outcomes were earlier shown to alter 
how people activate, evaluate, and explain 
information during reasoning, motivations 
for nondirectional outcomes (at least in 
terms of the accuracy and closure moti- 
vations reviewed here) instead alter how 
much activation, evaluation, or explanation, 
in fact, occurs. Furthermore, as the findings 
presented here illustrate, such quantitative 
differences in thought can often affect the 
outcomes of people’s judgments and deci- 
sions just as much as the qualitative differ- 
ences described previously. 


Limits to Outcome-Motivated Thinking 


Although, so far, people have been shown to 
have an impressive array of cognitive mech- 
anisms at their disposal when attempting 
to reach desired conclusions, limits do ex- 
ist concerning when these mechanisms are 
applied. These limits are first described for 
directional outcome-motivated thinking and 
then for nondirectional outcome-motivated 
thinking. 


REALITY CONSTRAINTS ON MOTIVATIONS FOR 
DIRECTIONAL OUTCOMES 

Although there are often specific outcomes, 
such as positive self-views, that people 
have some preference for during judgment, 
most individuals still acknowledge there 
is some kind of “objective reality” about 
whatever information they are considering.
As a result, motivation for direc-
tional outcomes operates within what Kunda
(1990) has called reality constraints (see also 
Pyszczynski & Greenberg, 1987; cf. Kruglan-
ski, 1999). Therefore, although there is a
degree to which people adjust their defini- 
tions of success, engage in selective recall, 
or seek to criticize unfavorable evidence, 
this does not make them entirely unrespon-
sive to the world around them, except perhaps
in extreme circumstances (see Bachman & 
Cannon, Chap. 21). 

Indeed, evidence for this principle of re- 
ality constraints has been repeatedly found 
in the context of the research previously 
described. For example, a study using a 
paradigm discussed earlier, in which partic- 
ipants first learned that introverts or extro- 
verts were generally more successful before 
rating themselves on these traits, was per- 
formed using participants who had been pre- 
selected as having high trait levels of either 
introversion or extroversion (Sanitioso et al.,
1990). Although beliefs that one trait was 
more beneficial than the other increased ev- 
eryone’s self-ratings concerning that trait, 
demonstrating motivated reasoning, there 
was also a large effect of people’s chronic dis-
positions. Introverts’ ratings of themselves
were always more introverted than extro- 
verts’ ratings of themselves, no matter how 
beneficial the introverts believed the trait of 
extroversion to be. That is, regardless of how 
desirable it would have been, introverts did 
not suddenly believe themselves to be extro- 
verts and vice versa. 

Another example of the influence of 
reality constraints is that people’s think- 
ing is guided by their preferred outcomes 
to a much greater extent in situations of 
uncertainty (e.g., Dunning, Meyerowitz, & 
Holtzberg, 1989; Hsee, 1995). When there is 
more potential for constructing idiosyncratic 
criteria for a certain judgment (e.g., judging 
whether one possesses somewhat vague traits
such as sensibility or insecurity), then peo- 
ple use this opportunity to select criteria that 
allow them to reach their desired conclu- 
sion. However, when there is less potential 
for this construction (e.g., judging whether 
one possesses more precise traits such as 


less motivated reasoning (Dunning et al., 
1989). Overall, these results suggest that 
thinking and reasoning inspired by direc- 
tional outcomes do not so much lead peo- 
ple to ignore the sometimes disappointing 
reality they face as inspire them to
exploit the uncertainties that exist in this re- 
ality to their favor. 


COGNITIVE-RESOURCE CONSTRAINTS ON 
ACCURACY MOTIVATION 

Virtually all the effects of accuracy motiva- 
tion reviewed here involve increases in the 
total amount of information processing that 
people perform during judgment. There- 
fore, in circumstances in which one’s abil- 
ity to engage in this information processing 
is constrained, the effects of increased accu- 
racy motivation should be minimal (Fiske & 
Neuberg, 1990). One demonstration of this 
was provided by Pendry and Macrae (1994). 
As described earlier, accuracy-motivated in- 
dividuals who were forming an impression 
of a target displayed an increased use of 
individuating trait and behavioral informa- 
tion when they possessed their full infor- 
mation processing resources (see Neuberg & 
Fiske, 1987). However, accuracy-motivated 
individuals whose processing resources were 
depleted based their impression primarily 
on categorical information in the same way 
as those who had little accuracy motiva- 
tion (see also Kruglanski & Freund, 1983). 
In addition, Sanbonmatsu and Fazio (1990) 
showed that the influence of accuracy moti- 
vation in reducing people’s assimilation of 
their judgments to highly accessible atti- 
tudes disappears when people are placed un- 
der time pressure, which prevents extended 
information processing. 


DOES MOTIVATION FOR ACCURACY RESULT IN 
ACCURATE REASONING? 

Another important consideration of the ef- 
fects of accuracy motivation on thinking and 
reasoning is that even when people high in 
accuracy motivation are free to engage in 
extended information processing, this does
not guarantee that they will arrive at more 
accurate judgments. One obvious example
is when information be-
yond what is immediately and effortlessly
available does not exist or has faded from 
memory (see, e.g., Thompson et al., 1994). 
In another manifestation, people are af-
fected by certain biases outside their aware-
ness or are aware of such biases but un-
aware of what the proper strategy is to
correct them. In all these circumstances, al- 
though accuracy motivation might increase 
information search, recall, and considera- 
tion of multiple interpretations, it would not 
be expected to eliminate judgment errors 
(Fischhoff, 1982), and might even increase 
them (Pelham & Neter, 1995; Tetlock & 
Boettger, 1989). 


DISTINCTIONS AMONG CIRCUMSTANCES THAT LEAD 
TO ACCURACY MOTIVATION 

As alluded to earlier, the different types 
of accuracy motivation inductions reviewed 
here are not always equivalent and can have 
markedly different effects. For example, al- 
though having one’s outcomes dependent on 
another person can increase desires for accu-
racy in diagnosing that person’s true charac-
ter (e.g., Neuberg & Fiske, 1987), in other
cases such circumstances can produce a de- 
sire to see a person that one is going to be 
depending on in the best possible light (e.g., 
Berscheid et al., 1976; Klein & Kunda, 1992; 
see Kruglanski, 1996). As another exam- 
ple, although believing that one’s judgment 
has important consequences may motivate 
an accurate consideration of all the relevant 
evidence, it could also motivate a more gen- 
eral need to increase elaborative thinking 
that is not necessarily focused on accuracy 
(see Footnote 3; Petty & Wegener, 1999). Fi- 
nally, although justifying one’s judgments to 
an audience can motivate accuracy when the 
opinion of the audience is unknown, it can 
also lead to more directional outcome mo- 
tivation, such as ingratiation toward this au- 
dience, when the opinion of the audience is 
known (Tetlock, 1983; see Lerner & Tetlock, 
1999). Therefore, when attempting to antic- 
ipate the effects of accuracy motivation on 
reasoning in a particular situation, it is im-
portant to consider both the current source
of this motivation and the context in
which it exists.


THE INFLUENCE OF INFORMATION AVAILABILITY ON 
CLOSURE MOTIVATION 

Certain qualifications must also be noted in 
the effects of closure motivation. All the 
findings discussed so far have involved the 
tendency for people with increased closure 
motivation to quickly assimilate their judg- 
ments to readily available or highly acces- 
sible information, leading to an early “freez- 
ing” of their information search. However, in 
situations in which little information is avail- 
able, high closure motivation may inspire ef- 
forts to find something clear and concise to 
“seize” upon and increase information search 
(see Kruglanski & Webster, 1996). For exam- 
ple, in the Kruglanski et al. (1993) studies 
described previously that involved partners 
discussing the verdict of a mock trial, people 
with high closure motivation preferred eas- 
ily persuadable partners and were unwilling 
to consider alternative arguments only when 
they had enough information at their dis- 
posal (i.e., a summarized legal analysis) to 
form a clear initial impression. When these 
same individuals were not provided with the 
legal analysis and did not begin the discus- 
sion with a clear opinion, they expressed a 
desire to be paired with someone who was 
highly persuasive and shifted toward their 
partner’s point of view. 


Conclusions on Outcome-Motivated 
Thinking 


Recent research has uncovered many po- 
tential routes by which people’s desires for 
particular judgment outcomes can affect 
their thinking and reasoning. To summa- 
rize, both directional outcome motivations, 
where people have a specific preferred con- 
clusion they are trying to reach, and nondi- 
rectional outcome motivations, where peo- 
ple’s preferred conclusions are more general, 
alter many basic cognitive processes during 
reasoning. These include (1) the explana- 
tion of events and behaviors; (2) the or- 
ganization, recall, and activation of knowl- 
edge in memory; and (3) the pursuit and 


306 THE CAMBRIDGE HANDBOOK OF THINKING AND REASONING 


evaluation oPedHientale devant pss /MetisiianaryRayalatory focus theory distinguishes be- 


making. Outcome motivation effects involve 
both how such cognitive processes are initi- 
ated and directed as well as how thoroughly 
these processes are implemented. Moreover, 
in any given situation the specific cognitive 
processes influenced by outcome motivation 
are typically those that aid the gathering and 
interpretation of information supporting the 
favored outcome. In this self-fulfilling way, 
then, people’s outcome-motivated reason- 
ing often successfully brings about their de- 
sired conclusions. 


Strategy-Motivated Thinking 


Although outcome-motivated thinking has 
been the most widely studied form of mo- 
tivated reasoning, other varieties of motiva- 
tional influences on cognition are also pos- 
sible. One alternate perspective that has 
more recently emerged and complements 
an outcome-based view proposes that peo- 
ple are motivated not only with respect to 
the outcomes of their judgments but also 
with respect to the manner in which they 
go about making these judgments. That is, 
not only do people have preferred conclu- 
sions, but they also have preferred strategies 
for reaching their conclusions (Higgins & 
Molden, 2003; cf Tyler & Blader, 2000). 
Therefore, independent of whatever out- 
come holds the most interest for them, 
people may be motivated to reach these 
outcomes using strategies that “feel right” 
in terms of, and allow them to sustain, 
their current motivational orientation (eg., 
eagerly gathering evidence that might sup- 
port a positive self-view or facilitate cog- 
nitive closure versus vigilantly suppressing 
evidence that could undermine a positive 
self-view or threaten cognitive closure). 
Several lines of research have examined 
how motivations for particular judgment 
strategies can also influence people’s ba- 
sic cognitive processes. In the vast majority 
of these studies, strategic motivations were 
measured and manipulated in terms of peo- 
ple’s regulatory focus (see Higgins, 1997). 


tween two basic motivational orientations: 
a promotion focus involving concerns with 
advancement and approaching gains versus 
avoiding nongains, and a prevention focus 
involving concerns with security and ap- 
proaching nonlosses versus avoiding losses. 
Because it centers on the presence and ab- 
sence of positive outcomes, a promotion fo- 
cus has been found to create preferences 
for eager judgment strategies that empha- 
size advancement (or, to use signal detec- 
tion terminology, finding hits) and ensure 
against overlooking something that might 
be important (or, to again use signal de- 
tection terminology, avoiding errors of omis- 
sion). In contrast, because it centers on the 
presence and absence of negative outcomes, 
a prevention focus has been found to engen- 
der preferences for vigilant judgment strate- 
gies that emphasize protection (or making 
correct rejections) and ensure against com- 
mitting to something that might be a mis- 
take (or avoiding errors of commission; see 
Higgins & Molden, 2003). Therefore, even 
in circumstances in which individuals are 
pursuing the same outcome, they may show 
marked differences in their pursuit of this 
outcome depending upon whether they are 
currently promotion focused or prevention 
focused. The studies reviewed here are in- 
tended to illustrate the effects of eager 
or vigilant strategic motivation on several 
types of thought processes similar to those 
found to be influenced by outcome moti- 
vation (for a larger overview, see Higgins & 
Molden, 2003). 
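
The signal detection terms used above can be made concrete with a small sketch. It is offered only as an illustration and is not drawn from the studies cited: the evidence strengths, the two thresholds, and the names classify, tally, and TRIALS are all hypothetical, and treating an eager strategy as a lenient acceptance criterion and a vigilant strategy as a strict one is a simplifying assumption.

```python
from collections import Counter


def classify(signal_present: bool, said_yes: bool) -> str:
    """Label one judgment with its signal detection outcome."""
    if signal_present:
        # Saying "no" to something true is an error of omission (a miss).
        return "hit" if said_yes else "miss"
    # Saying "yes" to something false is an error of commission (a false alarm).
    return "false_alarm" if said_yes else "correct_rejection"


def tally(trials, criterion: float) -> Counter:
    """Accept a hypothesis whenever its evidence strength meets the criterion."""
    return Counter(
        classify(present, strength >= criterion) for strength, present in trials
    )


# Hypothetical trials: (evidence strength, whether the hypothesis is actually true).
TRIALS = [
    (0.9, True), (0.7, True), (0.4, True), (0.3, True),
    (0.6, False), (0.5, False), (0.2, False), (0.1, False),
]

# Assumption: eager = lenient threshold, vigilant = strict threshold.
eager = tally(TRIALS, criterion=0.25)
vigilant = tally(TRIALS, criterion=0.65)

print("eager:   ", dict(eager))
print("vigilant:", dict(vigilant))
```

With these made-up numbers, the lenient criterion secures every hit at the price of some errors of commission, whereas the strict criterion makes every correct rejection at the price of some errors of omission, which is the trade-off the two strategies are described as accepting.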


EFFECTS ON THE CONSIDERATION OF 
ALTERNATIVE HYPOTHESES 

Considering alternative hypotheses is a fun- 
damental component of many varieties of 
thinking (see Sloman & Lagnado, Chap. 5). 
How might eager versus vigilant strategic 
preferences influence this process? In gen- 
eral, an eager strategy of considering alter- 
natives would involve attempting to attain 
hits and to ensure against errors of omission 
by generating and selecting any plausible 
hypotheses that could remotely be correct. 
However, a vigilant strategy of considering
alternatives would involve attempting to
make correct rejections and to ensure against
errors of commission by generating and se- 
lecting only the most probable hypotheses 
that seem likely to be correct. Therefore, peo-
ple in a promotion focus would be expected
to consider a greater number of alternatives 
during thinking and reasoning than people 
in a prevention focus. 

This question was addressed in several 
studies by Liberman, Molden, Idson, and 
Higgins (2001). One important instance of 
considering alternatives occurs when people 
form hypotheses about what they are per- 
ceiving (see Tversky, Chap. 10). Therefore, 
Liberman et al. (2001) examined the effects 
of people’s strategic preferences on a task 
where people identified vague and distorted 
objects in a series of photographs. Across 
several studies in which a promotion or pre-
vention focus was both measured as an indi-
vidual differences variable and induced ex-
perimentally, results indicated that those in a
promotion focus generated a greater number 
of alternatives for the identity of the objects 
than those in a prevention focus (see also 
Crowe & Higgins, 1997). 

In addition to examining the effects of 
strategic preferences on generating alterna- 
tive hypotheses for object perception, Liber- 
man et al. (2001) also investigated whether 
similar effects occurred for social percep- 
tion. Participants read a scenario describing 
the helpful behavior of a target person and 
were asked to evaluate several equally plau- 
sible alternative explanations for this behav- 
ior. Consistent with the results described 
previously, participants in a promotion fo- 
cus again selected a greater number of al- 
ternative explanations than participants in 
a prevention focus. Moreover, these effects 
were also found to influence the general im- 
pressions people formed of the target. Af- 
ter selecting their reasons for the target’s 
helpful behavior, participants predicted how 
helpfully he or she would behave in the fu- 
ture. Those in a promotion focus, because 
they were considering more interpretations 
of a target’s behavior, formed more equivo- 
cal impressions and showed relatively little 
generalization about the target’s behavior as
compared with participants in a prevention focus
(see Kelley, 1973).

Finally, additional research by Molden 
and Higgins (2004) has more recently 
demonstrated similar effects for eager ver- 
sus vigilant strategic preferences on the gen- 
eration and selection of alternatives during 
basic categorization processes. People were 
given vague descriptions of a target person 
from which it was not clear how to cate- 
gorize him or her correctly, and a number 
of alternatives could all have been possible. 
As before, participants with either a chronic 
or experimentally induced promotion focus 
generated more possible categories for the 
target than those with either a chronic or 
experimentally induced prevention focus. 

Overall, then, people’s eager versus vig- 
ilant strategic preferences play a significant 
role in their generation of alternatives during 
a number of important thought processes. 
Moreover, it is important to note that in all 
the studies described in this section, every- 
one was pursuing the exact same outcome 
(identifying an object, explaining behaviors) 
and did not have motivations for any specific 
conclusion or end-state. Furthermore, mea- 
sures of people’s motivations for more gen- 
eral outcomes such as accuracy and closure 
were also taken, and these factors were sta- 
tistically removed from all analyses. There- 
fore, the observed effects of promotion or 
prevention motivational orientations are dis- 
tinct from the outcome motivation effects 
reviewed earlier and can be attributed to the 
influences of these orientations on people’s 
strategic preferences. 


EFFECTS ON COUNTERFACTUAL THINKING 


Besides generating and evaluating hypothe- 
ses, another way in which people consider al- 
ternatives during reasoning is in their use of 
counterfactuals. As briefly mentioned earlier, counterfactual thinking involves
mentally undoing the present state of affairs and
imagining alternative realities “if only” dif- 
ferent decisions had been made or actions 
been taken (Roese, 1997). Several differ- 
ent varieties of counterfactual thinking have 
been identified. One broad distinction that 
has been made is between thoughts that concern the reversal of a previous inaction
(e.g., if only I had acted, things might have
gone better), or additive counterfactuals, and 
thoughts that concern the reversal of a pre- 
vious action (e.g., if only I hadn’t acted, 
things wouldn’t be so bad), or subtractive 
counterfactuals. 

Because additive counterfactuals simu- 
late the correction of a past error of omis- 
sion, this type of thinking represents a more 
eager strategy of considering alternative real- 
ities. In contrast, because subtractive coun- 
terfactuals simulate the correction of a past 
error of commission, this type of thinking 
represents a more vigilant strategy of con- 
sidering alternate realities. Therefore, a pro- 
motion focus should increase the generation 
of additive counterfactuals, and a preven- 
tion focus should increase the generation of 
subtractive counterfactuals. In line with this, 
Roese, Hur, and Pennington (1999) found
that, both when analyzing hypothetical ex- 
amples and when describing particular in- 
stances of their own behavior, participants 
who considered promotion-related setbacks 
(i.e., nongains and missed opportunities for 
advancement) offered a greater number of 
additive counterfactuals, whereas partici- 
pants who considered prevention-related 
setbacks (i.e., losses and missed opportuni- 
ties to prevent mistakes) offered a greater 
number of subtractive counterfactuals. In 
the literature that exists on counterfactual 
thinking, it has been traditionally assumed 
that subtractive counterfactuals are more 
common than additive counterfactuals and 
that failures associated with action inspire 
more regret than failures associated with in- 
action (Roese, 1997). However, the results 
of these studies demonstrate that, in some 
cases, people’s strategic preferences can re- 
sult in additive counterfactuals being more 
common and perhaps being associated with 
greater regret (see also Camacho, Higgins, & 
Luger, 2003). 

It is important to note that care was taken 
to make sure the outcomes that participants 
were considering in these studies did not dif- 
fer across any important dimensions such as 
how painful they were imagined to be or 


et al., 1999). Therefore, the results can again 
only be explained in terms of differences in 
strategic motivation. 


EFFECTS ON FAST VERSUS ACCURATE INFORMATION PROCESSING


A major focus across many areas of psychology
has been when and why people
choose to emphasize either speed or ac- 
curacy in their thinking and decision mak- 
ing (e.g., Josephs & Hahn, 1995; Zelaznik, 
Mone, McCabe, & Thaman, 1988). Förster, 
Higgins, and Bianco (2003) more recently 
investigated whether promotion preferences 
for strategic eagerness would result in faster 
information processing and a higher quan- 
tity of output in a search for possible hits, 
whereas prevention preferences for strategic 
vigilance would result in more accurate in- 
formation processing and a higher quality of 
output in an effort to avoid mistakes. 

Participants were given a task involving 
four pictures taken from a children’s “con- 
nect the dots” drawing book. For each pic- 
ture, the objective was to connect sequen- 
tially numbered dots within a given time 
period in order to complete the outline of 
an image. Participants’ speed on each pic- 
ture was assessed by the highest number 
dot they reached by the end of the time 
period for that picture, and their accuracy 
on each picture was assessed by the number
of dots they skipped (i.e., that were
not connected). Across two studies where 
participants’ promotion or prevention focus 
was both measured and experimentally in- 
duced, promotion-focused individuals were 
faster and produced a higher quantity of re- 
sponses, whereas prevention-focused indi- 
viduals were more accurate and produced 
a higher quality of responses over the entire 
task. Moreover, both of these tendencies in- 
creased in intensity as people moved closer 
to goal completion, resulting in stronger 
effects of strategic preferences toward the 
end of a task than toward the beginning of 
a task (i.e., the “goal looms larger” effect 
in which motivation increases as one’s dis- 
tance to the completion of a goal decreases; 
Lewin, 1935). This provides strong support 
ment strategies can alter their concerns with 
different aspects of information processing 
(e.g., speed versus accuracy). 


EFFECTS ON KNOWLEDGE ACTIVATION AND RECALL 


Analogous to the selective recall and ac- 
tivation of information from memory that 
occurs in the presence of motivations for 
directional outcomes, another influence of 
strategic preferences on thinking is to increase
sensitivities to, and recall of, information
that is particularly relevant to these
preferences. A study by Higgins, Roney, 
Crowe, and Hymes (1994) demonstrated 
this by having participants read an essay 
about the life of a hypothetical target person 
in which two different types of situations 
were encountered. In one type of situation, 
the target used eager strategies that were ad- 
vancement oriented (e.g., waking up early 
in order to be on time for a favorite class), 
whereas in the other type of situation, the 
target used vigilant strategies that were more 
protection oriented (e.g., being careful not 
to sign up for a class whose schedule con- 
flicted with a desired activity). Individu- 
als who had chronic promotion orientations 
showed a stronger sensitivity for information 
related to advancement versus protection 
strategies and later showed greater recall for 
these episodes, whereas individuals who had 
chronic prevention orientations showed the 
reverse effect. 

Another study by Higgins and Tykocin- 
ski (1992), which again had people read an 
essay about the life of a hypothetical tar- 
get person, extends these findings. In this 
study, the target person experienced situa- 
tions that either involved the presence or 
absence of gains (finding $20 on the street 
or missing a movie that he or she wanted to 
see, respectively) or the presence or absence 
of losses (being stuck in a crowded subway 
for an extended period of time or getting 
a day off from a particularly arduous class 
schedule, respectively). Similar to the previ- 
ous study, individuals who were chronically 
promotion focused showed a stronger sen- 
sitivity and recall for gain-related informa- 


of eager strategic preferences, whereas in- 
dividuals who were chronically prevention- 
focused showed a stronger sensitivity and re- 
call for loss-related information that is more 
meaningful in the context of vigilant strate- 
gic preferences. 


Strategic Preferences and Regulatory Fit 


Although the studies presented thus far 
have demonstrated how people’s motiva- 
tional orientations can lead them to pre- 
fer and choose certain judgment strategies, 
situations may exist in which they may be 
more or less able to follow these preferences. 
For example, some situations may gener- 
ally require greater use of eager strategies 
of pursuing gains or vigilant strategies of 
preventing mistakes such as when supervi- 
sors demand either innovative and creative 
practices of all their employees in search 
of advancement or cautious and responsi- 
ble practices in hope of preventing losses. 
What might be the consequences of making 
judgments and decisions in a way that ei- 
ther suits one’s current strategic preferences 
(i.e., promotion-focused individuals using 
eager strategies and prevention-focused in- 
dividuals using vigilant strategies) or does 
not suit one’s preferences (i.e., promotion- 
focused individuals using vigilant strategies 
and prevention-focused individuals using ea- 
ger strategies)? 

Higgins and colleagues have examined 
this question and investigated how the regu- 
latory fit between one’s motivational orien- 
tation and the means one uses during goal 
pursuit affects thinking and reasoning (e.g., 
Camacho et al., 2003; Freitas & Higgins, 
2002; Higgins, Idson, Freitas, Spiegel, & 
Molden, 2003). Although space limitations 
prohibit a more thorough review of this 
work here (see Higgins, 2000a; Higgins & 
Molden, 2003), the general findings have 
been that the primary consequence of 
regulatory fit is to increase the perceived 
value of the goal one is pursuing. That 
is, regulatory fit (as compared with nonfit) 
leads people to “feel right” about their goal 
pursuit, which then leads them to (1) feel 
feels right feels good; see Freitas & Higgins, 
2002); (2) experience the outcomes they are 
striving for as having more value or worth 
(i.e., what feels right is good; see Higgins 
et al., 2003); and (3) believe the strate- 
gies they are using are inherently right (i.e., 
what feels right is right; see Camacho et al., 
2003). Therefore, another avenue for fu- 
ture research on how people’s motivations 
to use certain judgment strategies can af- 
fect their thought processes is the further 
refinement and elaboration of the process of 
regulatory fit. 


Conclusions on Strategy-Motivated 
Thinking 


In sum, several emerging programs of re- 
search are beginning to demonstrate that, 
beyond the effects on reasoning of people’s 
desires for particular judgment outcomes, 
there are additional effects on reasoning of 
people’s desires to use particular judgment 
strategies. For example, preferences for ea- 
ger judgment strategies, shown by those with 
promotion concerns, versus preferences for 
vigilant judgment strategies, shown by those 
with prevention concerns, alter many basic 
cognitive processes during reasoning. These 
include (1) the generation and testing of hy- 
potheses, (2) the use of counterfactual think- 
ing, (3) an emphasis on fast versus accurate 
processing of information, and (4) knowl- 
edge activation and recall. Strategy motiva- 
tion effects include whether cognitive pro- 
cesses are implemented in order to advance 
the right decision and avoid errors of omis- 
sion in judgment or to protect against the 
wrong decision and avoid errors of commis- 
sion in judgment. They also include whether 
such implementation fits or does not fit one’s 
current motivational orientation. The imple- 
mentation of cognitive processes for either 
of these strategic reasons or for regulatory 
fit influences what pieces of information are 
considered during judgment and how much 
this information is valued in a final decision. 
In this way, then, people’s strategic motiva- 
tions have important effects on their think- 


outcome motivations. 


General Conclusions and Future 
Directions 


The sheer number and diversity of the stud- 
ies reviewed here is a testament to the return 
of motivational perspectives on cognition 
to the vanguard of psychology. The rich- 
ness and consistency of the findings emerg- 
ing from these studies is also a testament to 
the utility of this perspective in the study 
of thinking and reasoning. We optimistically 
forecast a further expansion of research in- 
formed by motivational perspectives and, in 
conclusion, briefly outline two general di- 
rections we believe should be priorities for 
the future. 

The first direction involves expanding 
current conceptualizations of the ways in 
which motivational and cognitive processes 
interact during judgment. Although there 
is still much to be learned from examining 
the effects on thinking of people’s motiva- 
tions for certain outcomes (either directional 
or nondirectional), there may potentially 
be other important sources of motivated 
thought as well. In this chapter, we reviewed 
our own initial research on one of these 
possible sources — people’s motivations for 
employing preferred strategies during judg- 
ment. We expect that further study will lead 
to the development of additional perspec- 
tives on the interface of motivation and cog- 
nition that go beyond both motivated out- 
comes and motivated strategies. 

The second direction involves moving 
past research that examines different va- 
rieties of motivated thinking in isolation 
from one another (i.e., studying situations in 
which people are only motivated to achieve 
positive self-views or only motivated to be 
accurate). There is a need to consider how 
multiple goals, desires, and motives inter- 
act to influence the thought process — that 
is, the effects of patterns of motivational 
forces. For instance, it has been noted for 
potential objectives when processing information
(e.g., Chen & Chaiken, 1999). 
Although it is certainly the case that, at 
times, objectives such as accuracy, ingrati- 
ation, or self-enhancement may be predom- 
inant (Kruglanski, 1999), it is also true that 
there are many instances in which several 
of these objectives are pursued simultane- 
ously. What happens when people not only 
want to be accurate but also want to please 
others or boost their own self-esteem? Stud- 
ies addressing these questions are just be- 
ginning to appear, and early findings are 
indicating that important interactions can 
occur (Lundgren & Prislin, 1998; Nienhuis, 
Manstead, & Spears, 2001; Ruscher, Fiske, & 
Schnake, 2000). 

Similarly, although we have made a dis- 
tinction between outcome- and strategy- 
motivated thinking and discussed their ef- 
fects independently, there are situations in 
which these two sources of motivation op- 
erate in concert. One of these situations 
has been the focus of recent studies by 
Molden and Higgins (2004). These studies 
examined how preferences for eager ver- 
sus vigilant decision strategies influence peo- 
ple’s generation of alternative explanations 
for their own success and failure. In addi- 
tion to replicating both the previously dis- 
cussed self-serving pattern of attributions 
for performance (an outcome-motivated ef- 
fect) and the selection of a greater number 
of alternative attributions by those preferring
eager strategies over vigilant strategies 
(a strategy-motivated effect), these studies 
showed that self-serving and strategic moti- 
vations interacted to determine the extent to 
which people generalized their current ex- 
periences to their future performance. Indi- 
viduals using eager strategies, because they 
tended to consider multiple attributions, in- 
cluding both internal and external causes, 
showed only moderate generalization after 
both success and failure. In contrast, indi- 
viduals using vigilant strategies, because they 
tended to consider only a few attributions, 
including primarily internal causes follow- 
ing success but external causes following 


lowing success and almost no generaliza- 
tion after failure. These results demonstrate 
the importance of considering the effects of 
multiple sources of motivated reasoning simultaneously
(see also Förster, Higgins, & 
Strack, 2000). 

One final way in which investigating the 
cognitive effects of interacting motivational 
forces could be fruitfully expanded is by 
synthesizing work on how motivation influ- 
ences reasoning with work on how affect in- 
fluences reasoning (see Forgas, 2000; Mar- 
tin & Clore, 2001). Great strides have been 
made in determining the mechanisms by 
which affective and emotional states can al- 
ter people’s judgments. Many of the changes 
in the quality and quantity of information 
processing found in this research bear a strik- 
ing resemblance to the motivational effects 
reviewed here. For example, positive moods 
have generally been found to support less 
thorough and complex information process- 
ing, similar to closure motivation, whereas 
negative moods have generally been found to 
support more thorough and complex infor- 
mation processing, similar to accuracy mo- 
tivation (for a review, see Schwarz & Clore, 
1996). This is not to say, however, that the 
effects reviewed here are actually just due 
to changes in emotion, because many of the 
studies discussed carefully controlled for af- 
fective influences and continued to find in- 
dependent effects. Therefore, it would be 
fruitful to investigate how affective think- 
ing may give rise to motivational thinking 
(e.g., Erber & Erber, 2000), and how motivational
thinking may give rise to affective
thinking (e.g., Higgins, 2000b), in order
to develop a better understanding of 
how these two factors are related and what 
their combined and separate consequences 
might be. 

In conclusion, this chapter reviewed re- 
search that displays the broad applicability 
of emerging motivational perspectives to the 
study of thinking and reasoning. Through 
this review, we attempted to convey the po- 
tential utility of these perspectives and to ad- 
vocate a greater incorporation of principles 
in future research. The further refinement 
and elaboration of these principles, we be- 
lieve, will benefit not only the study of think- 
ing but also cognitive science in general. 
Notes 


1. One area of study that is notably absent in 
this review concerns affective and emotional 
influences on reasoning. This important and 
extensive literature certainly enjoys a central 
place in the study of motivated thinking. How- 
ever, the topic of affect and cognition has re- 
cently been the subject of several entire hand- 
books on its own (see Forgas, 2000; Martin 
& Clore, 2001). Therefore, rather than at- 
tempt an extremely limited overview of this 
major topic alongside the other topics men- 
tioned previously, we instead refer the inter- 
ested reader to these other sources. The larger 
relation between research on emotional think- 
ing and the research described here is discussed 
briefly below. 


2. It is important to note that, although a wealth 
of studies have demonstrated people’s broad 
and robust desires for positive self-evaluation, 
these studies have almost exclusively been per- 
formed on members of Western, and gener- 
ally more individualistic cultures (Baumeister, 
1998). In contrast, recent evidence collected 
from Eastern, and generally more collectivist 
cultures, has demonstrated that, in these pop- 
ulations, such desires for self-evaluation are of- 
ten considerably less and that some of the ef- 
fects described here are thereby weaker (see 
Greenfield, Chap. 27). Yet, this should not 
be taken to mean that the general effects 
of outcome-motivated thinking are necessar- 
ily culture specific or only apply to West- 
ern cultures. Instead, this indicates that, if 
general principles of this type of motivated 


tions of outcome-motivated thinking in dif- 
ferent cultures should take care to identify 
which specific outcomes are culturally desirable 
in those contexts (e.g., proper fulfillment of 
one’s social duties to others, high social sta- 
tus relative to others; see, e.g., Endo, Heine, & 
Lehman, 2000). 


3. Another type of nondirectional outcome motivation
that has been the focus of considerable 
study is the need for cognition, or a general desire 
for elaborative thinking and increased cogni- 
tive activity (Cacioppo et al., 1996). At times, 
the need for cognition has been considered 
equivalent to accuracy motivation (Chen & 
Chaiken, 1999). Consistent with this, research 
has shown that an increased need for cognition 
can affect thinking in the same way as heightened
accuracy motivation, reducing biases during
attribution (D’Agostino & Fincher-Kiefer, 
1992), increasing recall (Srull et al., 1985), less- 
ening assimilation to highly accessible attitudes 
(Florack, Scarabis, & Bless, 2001), and increas- 
ing information search (Verplanken, 1993; see 
Cacioppo et al., 1996). However, at times the 
effects of the need for cognition differ from 
those of accuracy motivation. Accuracy moti- 
vation, because it inspires a thorough consid- 
eration of all available evidence, weakens the 
tendency to base judgments on early superfi- 
cial impressions (i.e., primacy effects; Kruglan- 
ski & Freund, 1983). In contrast, the need 
for cognition, because it simply inspires cog- 
nitive elaboration even if this involves only 
part of the available evidence, can lead to in- 
creased rumination on one’s early superficial 
impressions and strengthen primacy effects (see 
Petty & Wegener, 1999). Given these concep- 
tual and empirical distinctions, we have not in- 
cluded research on need for cognition in our 
larger review of the effects of accuracy moti- 
vation and consider it a separate form of nondi- 
rectional outcome motivation (for a review 
of need for cognition effects, see Cacioppo 
et al., 1996). 
CHAPTER 14 


Problem Solving 


Laura R. Novick 
Miriam Bassok 


Introduction 


People are confronted with problems on a 
daily basis such as extracting a broken light 
bulb from a socket, multiplying eight times 
seven, finding the roots of a quadratic equa- 
tion, planning a family vacation, and de- 
ciding whom to vote for in a presidential 
election. Although these examples differ in 
many ways, they share a common core: “A 
problem arises when a living creature has 
a goal but does not know how this goal 
is to be reached. Whenever one cannot go 
from the given situation to the desired situation
simply by action [i.e., by the performance
of obvious operations], then there has 
to be recourse to thinking” (Duncker, 1945, 
p. 1). Consider the broken light bulb. The 
obvious operation — holding the glass part 
of the bulb with one’s fingers while unscrew- 
ing the base from the socket —is prevented by 
the fact that the glass is broken. Thus, there 
must be “recourse to thinking” — for example, 


one might try mounting half a potato on the 
broken bulb. 


A little thought concerning the light bulb 
situation, as well as our other examples, re- 
veals that what constitutes a problem for one 
person may not be a problem for another 
person, or for that same person at another 
point in time. For example, the second time 
one has to remove a broken light bulb from 
a socket, the solution likely can be retrieved 
from memory; there is no problem. Simi- 
larly, 8 x 7 would generally be considered 
a problem for 8-year-olds but not for read- 
ers of this chapter. Of course, age here is 
just a proxy for prior knowledge, for there 
are 6-year-olds for whom this question does 
not constitute a problem because they know 
the standard multiplication table. Given that 
a problem has been identified, the nature 
of people’s background knowledge pertain- 
ing to that problem has important implica- 
tions for the solution-related thinking they 
do. To understand this thinking, it is impor- 
tant to distinguish (1) the solver’s represen- 
tation of the problem (i.e., the solver’s un- 
derstanding of the underlying nature of the 
problem) and (2) the sequence of steps the 
the goal. 

A problem representation is a model 
of the problem constructed by the solver 
to summarize his or her understanding of 
the problem’s essential nature. Ideally, this 
model includes information about the goal, 
the objects and their interrelations, the op- 
erations that can be applied (i.e., the steps 
that can be taken) to solve the problem, and 
any constraints on the solution process. Con- 
sider, for example, Posner’s (1973, pp. 150- 
151) trains and bird problem: 


Two train stations are fifty miles apart. At 
2 P.M. one Saturday afternoon two trains 
start toward each other, one from each sta- 
tion. Just as the trains pull out of the sta- 
tions, a bird springs into the air in front of 
the first train and flies ahead to the front 
of the second train. When the bird reaches 
the second train it turns back and flies to- 
ward the first train. The bird continues to 
do this until the trains meet. If both trains 
travel at the rate of twenty-five miles per 
hour and the bird flies at a hundred miles 
per hour, how many miles will the bird have 
flown before the trains meet? 


Figure 14.1 shows two different representa- 
tions of this problem that imply different 
solution methods. Solver A [Figure 14.1(a)] 
represents the problem as one concerning 
the ongoing flight path of the bird, which 
is the focus of the problem as presented. 
This perspective yields a problem that would 
be difficult for most people to solve (e.g., 
a series of differential equations). In con- 
trast, solver B [Figure 14.1(b)] represents the 
problem from the perspective of the paths of 
the trains. This perspective yields a relatively 
easy distance-rate-time problem. To take an- 
other example, the problem 14 x 8 might 
be represented as 8 groups of 14 or as 10 
groups of 8 plus 4 groups of 8 (or in a variety of 
other ways). 
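
The contrast between the two representations of the trains and bird problem can be made concrete with a short Python sketch. The function names, the leg-by-leg summation, and the stopping tolerance are illustrative assumptions, not part of the original problem or of Figure 14.1. In particular, the bird-focused version below sums the bird's successive flights numerically rather than solving the differential equations mentioned above; it is only one way a solver might work within that representation.

```python
def bird_distance_via_trains(gap_miles, train_speed, bird_speed):
    """Train-focused representation: a single distance-rate-time step."""
    # The trains close the gap at twice the speed of one train.
    hours_until_meeting = gap_miles / (2 * train_speed)
    return bird_speed * hours_until_meeting


def bird_distance_via_legs(gap_miles, train_speed, bird_speed, tolerance=1e-12):
    """Bird-focused representation: add up each flight between the trains."""
    total = 0.0
    gap = float(gap_miles)
    while gap > tolerance:
        # The bird and the oncoming train approach each other.
        leg_hours = gap / (bird_speed + train_speed)
        total += bird_speed * leg_hours
        # Meanwhile both trains have moved, so the gap has shrunk.
        gap -= 2 * train_speed * leg_hours
    return total


# Values taken from the problem statement: 50 miles, 25 mph trains, 100 mph bird.
print(bird_distance_via_trains(50, 25, 100))  # 100.0
print(bird_distance_via_legs(50, 25, 100))    # approaches 100 after many legs
```

Both functions agree on the answer, but the train-focused version needs one division and one multiplication, whereas the bird-focused version must track dozens of ever-shorter flights. This illustrates how the choice of representation can determine the difficulty of the solution method.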

For some problems, the primary work of 
solution is to find the best representation; 
for other problems, there is little uncer- 
tainty about the representation, and the pri- 
mary work is to discover a solution path 
(or the best solution path) from the initial 
state of the problem (the situation as initially 


Consider, for example, the Tower of Hanoi 
problem: There are three pegs mounted on 
a base. On the leftmost peg, there are three 
disks of differing sizes. The disks are arranged 
in order of size with the largest disk on the 
bottom and the smallest disk on the top. 
The disks may be moved one at a time, 
but only the top disk on a peg may be 
moved, and at no time may a larger disk be 
placed on a smaller disk. The goal is to move 
the three-disk tower from the leftmost peg 
to the rightmost peg. Figure 14.2 shows all 
the possible legal arrangements of disks on 
pegs. The arrows indicate transitions be- 
tween states that result from moving a sin- 
gle disk. The shortest path that connects the 
initial state to the goal state (i.e., the opti- 
mum solution) is indicated by the thicker 
grey arrows. 
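
The state space described above is small enough to enumerate directly. The following Python sketch is an illustration under stated assumptions: the tuple encoding of states, the function names, and the use of breadth-first search are choices made here, not the method used to construct Figure 14.2, and the state ordering does not correspond to the state numbers in that figure.

```python
from collections import deque
from itertools import product


def legal_moves(state):
    """Yield every state reachable by moving one disk.

    state[d] is the peg (0, 1, or 2) holding disk d, where disk 0 is
    the smallest disk and disk 2 is the largest.
    """
    for disk, peg in enumerate(state):
        # Only the top disk on a peg may be moved.
        if any(state[smaller] == peg for smaller in range(disk)):
            continue
        for target in range(3):
            if target == peg:
                continue
            # A larger disk may never be placed on a smaller disk.
            if any(state[smaller] == target for smaller in range(disk)):
                continue
            yield state[:disk] + (target,) + state[disk + 1:]


def shortest_path(start, goal):
    """Breadth-first search for a shortest sequence of states."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        for nxt in legal_moves(current):
            if nxt not in parents:
                parents[nxt] = current
                queue.append(nxt)
    return None


# Every assignment of three disks to three pegs is a legal arrangement.
all_states = list(product(range(3), repeat=3))
path = shortest_path((0, 0, 0), (2, 2, 2))
print(len(all_states))  # 27
print(len(path) - 1)    # 7
```

The enumeration yields the 27 states shown in the figure, and the search finds that the optimum solution takes 7 moves.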

Researchers who study problem solving 
present people with various types of prob- 
lems for which those people do not have a 
prestored solution in memory and attempt 


Figure 14.1. Alternative representations of 
Posner’s (1973) trains and bird problem: (a) a 
representation focused on the bird; (b) a 
representation focused on the trains. In both 
panels, Station 1 and Station 2 are shown 50 miles 
apart; panel (b) also marks two 25-mile segments. (From 
“Transferring symbolic representations across 
non-isomorphic problems,” by L. R. Novick & 
C. E. Hmelo, 1994, Journal of Experimental 
Psychology: Learning, Memory, and Cognition, 20, 
p. 1297. Copyright 1994 by the American 
Psychological Association. Adapted with 
permission.) 
Figure 14.2. All possible problem states for the three-disk Tower of Hanoi problem. The thicker 
grey arrows show the optimum solution path connecting the initial state (state #1) to the goal state 
(state #27). 


to find regularities in the resulting problem- 
solving behavior. For example, Greeno 
(1978) distinguished problems of inducing 
structure [e.g., proportional analogies such 
as those found on standardized tests — for example,
bird:fly::snake:?? (solution is slither)], 
transformation (e.g., the Tower of Hanoi), 
and arrangement [e.g., anagrams — for ex- 
ample, unscramble dnsuo to form an English 
word (solution is sound)], and discussed 
the processes required to solve problems of 
each type. Regardless of the specific prob- 
lem type, problem-solving behavior involves 
an inherent interaction between construct- 
ing a representation and generating a solu- 
tion. However, some researchers are most in- 
terested in factors that affect the way solvers 
represent problems, whereas others look for 
regularities in the way solvers apply opera- 
tors to get from the initial state to the goal 
state. Based on their main focus of interest, 
researchers devise or select problems that are 


likely to induce distinct representations (e.g., 
the trains and bird problem, problems of in- 
ducing structure) or to require repeated se- 
lection and application of operators within 
a particular problem representation (e.g., 
the Tower of Hanoi and other problems of 
transformation, problems of arrangement). 
This division of labor, with its distinct his- 
toric antecedents and research traditions, has 
led to many interesting findings. We review 
the main findings from each tradition and 
then review results from studies that high- 
light the interaction between how people 
understand problems and how they derive 
problem solutions. 

The remainder of this chapter is orga- 
nized into five sections. First, we provide 
a brief historic perspective on problem- 
solving research. Next, we summarize re- 
search on the step-by-step process of gener- 
ating problem solutions. In the third section, 
we describe a variety of factors that affect 
sider the interplay between constructing a 
representation and generating a solution. Fi- 
nally, we draw some conclusions and con- 
sider directions for future research. Our re- 
view focuses on general findings that per- 
tain to a wide variety of problems. Research 
on specific types of processes that are in- 
volved in problem solving and on problem 
solving in particular content domains may 
be found elsewhere in this volume: induc- 
tion (see Sloman & Lagnado, Chap. 5); anal- 
ogy (see Holyoak, Chap. 6); causal learning 
(see Buehner & Cheng, Chap. 7); deductive 
reasoning (see Evans, Chap. 8); and problem 
solving in law (see Ellsworth, Chap. 28), sci- 
ence (see Dunbar & Fugelsang, Chap. 29), 
and medicine (see Patel, Arocha, & Zhang, 
Chap. 30). 


A Brief History 


Research on human problem solving has 
its origins in Gestalt psychology, an in- 
fluential approach in European psychol- 
ogy during the first half of the twentieth 
century. (Behaviorism was the dominant 
perspective in American psychology at this 
time.) Karl Duncker published a book 
on the topic in his native German in 
1935, which was subsequently translated 
into English and published 10 years later 
as the monograph “On problem-solving” 
(Duncker, 1945). Max Wertheimer also pub- 
lished a book on the topic in 1945, titled 
“Productive thinking.” An enlarged edition 
published posthumously includes previously 
unpublished material (Wertheimer, 1959). 
Interestingly, 1945 seems to have been a wa- 
tershed year for problem solving, for math- 
ematician George Polya’s book, “How to 
solve it,” also appeared then. (A second edi- 
tion was published 12 years later; Polya, 
1957.) Extending the organizational princi- 
ples of perception to the domain of problem 
solving, the Gestalt psychologists empha- 
sized the importance of problem represen- 
tation — how people view, interpret, or 
organize the given information — distinguish- 


the process of generating a solution. The 
Gestalt psychologists documented the im- 
pact of changes in perspective on problem 
difficulty as well as the effects of extrane- 
ous assumptions and prior knowledge on the 
way people understand problems and, there- 
fore, generate problem solutions. 

The psychological study of human prob- 
lem solving faded into the background af- 
ter the demise of the Gestalt tradition, and 
problem solving was investigated only spo- 
radically until 1972, when Allen Newell and 
Herbert Simon’s “Human problem solving” 
(Newell & Simon, 1972) sparked a flurry 
of research on this topic. In contrast to 
the Gestalt psychologists, Newell and Simon 
emphasized the step-by-step process of 
searching for a solution path connecting 
the initial state to the goal state. Their re- 
search goal was to identify general-purpose 
strategies that humans use to solve a variety 
of problems. Newell and Simon and their 
colleagues were heavily influenced by the 
information-processing approach to cogni- 
tive psychology and by work in computer 
science on artificial intelligence. These in- 
fluences led them to construct the General 
Problem Solver (GPS), a computer program 
that modeled human problem solving (Ernst 
& Newell, 1969; Newell & Simon, 1972). 
A great strength of GPS was its ability to 
solve problems as different as the Tower of 
Hanoi problem and the construction of logic 
proofs with a single general-purpose strat- 
egy (means-ends analysis, which we discuss 
in “Generating Problem Solutions”). 

In the mid- to late 1970s, the role of 
background knowledge became an impor- 
tant research topic in cognitive psychology, 
particularly in the area of text comprehen- 
sion (e.g., Anderson, Reynolds, Schallert, & 
Goetz, 1977; Bransford & McCarrell, 1974). 
In the field of problem solving, researchers 
recognized that a fundamental weakness 
of GPS was its lack of domain knowl- 
edge. For every problem type, the general- 
purpose strategy had to be supplemented 
with domain-specific knowledge. Moreover, 
research on expertise in knowledge-rich 
academic domains, such as mathematics, 
during the late 1970s and early 1980s, made 
clear the necessity of taking domain knowl- 
edge into account for understanding prob- 
lem solving. This research on expertise (e.g., 
Chi, Feltovich, & Glaser, 1981; Silver, 1979, 
1981) provided empirical evidence for asser- 
tions first made by Duncker decades earlier: 
In his discussion of expertise differences in 
the domain of mathematics, Duncker (1945, 
p. 110) noted that “with ‘poor’ mathemati- 
cians, the thought-material is from the very 
beginning more thoroughly imbued with 
perceptual functions. For the ‘good’ mathe- 
matician, on the other hand, there remains a 
more abstract stratum...in which only the 
specific mathematical properties still exist” 
(italics removed). 

It is perhaps inevitable that the two tra- 
ditions in problem-solving research — one 
emphasizing representation and the other 
emphasizing the process of generating a so- 
lution — would eventually come together. 
Although no single publication can be cred- 
ited for the rapprochement, one impetus for 
a blending of the two traditions was the re- 
alization that background knowledge plays a 
critical role in problem solving. In particular, 
differences in background knowledge called 
attention to the interdependence between 
the representation constructed and the solu- 
tion method employed, for solvers who con- 
structed different representations were ob- 
served to generate the solution in different 
ways. Figure 14.1 provides a clear example 
of this interdependence for the trains and 
bird problem. To take another example, the 
8-year-old son of one of the authors mentally 
represented the verbally stated multiplica- 
tion problem “sixty-seven times ninety-five” 
as (60 x 95) + (7 x 95) and then proceeded 
to mentally execute the indicated arithmetic 
operations to get the answer. In contrast, 
most people would represent this problem 
as 67 groups of 95 and turn to paper and pen- 
cil to compute the answer using the standard 
multiplication algorithm (given the absence 
of a calculator). The structure of this chapter 
aims to capture the evolution of research in 
the field of problem solving: from research 
on general principles of representation and 


portance of domain-specific knowledge, and 
from research that separates issues of repre- 
sentation and solution generation to a focus 
on their interaction. 
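
The decomposition in the multiplication example above can be written out as a short worked calculation. The intermediate products are computed here for illustration and are not reported in the original anecdote:

$$67 \times 95 = (60 + 7) \times 95 = (60 \times 95) + (7 \times 95) = 5700 + 665 = 6365.$$

Both representations lead to the same answer; they differ in which arithmetic operations the solver must carry out and whether those operations can be held in working memory.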


Generating Problem Solutions 


Algorithmic Versus Heuristic 
Solution Strategies 


The step-by-step solution process is the se- 
quence of actions solvers take to find and ex- 
ecute a procedure for generating a solution 
to the problem as they understand it. Re- 
searchers who study solution processes have 
made a distinction between algorithmic and 
heuristic strategies. 

An algorithm is a procedure that is guar- 
anteed to yield the solution. One type of al- 
gorithm is a mathematical formula. For ex- 
ample, multiplying the length of the base 
of a rectangle times its height is guaranteed 
to yield the rectangle’s area. Similarly, the 
formula 


$$X = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \qquad \text{(Eq. 14.1)}$$

is guaranteed to provide the roots of the 
quadratic equation 

$$aX^2 + bX + c = 0. \qquad \text{(Eq. 14.2)}$$


We discuss mathematical problem solving in 
some detail in “The Interplay Between Rep- 
resentation and Solution” (also see Gallistel 
& Gelman, Chap. 23). 
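
To make the idea of a formula as an algorithm concrete, the following minimal Python sketch applies the quadratic formula mechanically. The function name `quadratic_roots` and the use of complex arithmetic (so that a negative discriminant still yields roots) are illustrative choices, not part of the original discussion.

```python
import cmath


def quadratic_roots(a, b, c):
    """Return both roots of a*X**2 + b*X + c = 0 (a must be nonzero)."""
    if a == 0:
        raise ValueError("not a quadratic equation: a must be nonzero")
    # Complex square root, so a negative discriminant is handled too.
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))


# Example: X**2 - 3X + 2 = 0 factors as (X - 2)(X - 1) = 0.
print(quadratic_roots(1, -3, 2))
```

Whatever coefficients are supplied (with a nonzero), the same fixed sequence of steps produces the roots, which is what distinguishes an algorithm from a heuristic.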

Another type of algorithm — exhaustive 
search — involves checking every possible 
move. For example, one could solve the 
Tower of Hanoi problem by exhaustively 
considering every possible move in Fig- 
ure 14.2. Similarly, one could solve a four- 
letter anagram (e.g., idrb) by systematically 
evaluating the 24 possible permutations 
of the given letters (the solution is bird). 
For problems with a large number of pos- 
sible states, however, exhaustive search is 
impractical or impossible. For example, if 
the task is to find all possible solutions of a 
five-letter anagram (e.g., ebrda forms bread, 
among other words), exhaustive search 
would require examination of 120 letter orders. 
More strikingly, consider the game of 
chess (Holding, 1985): White has 20 possible 
opening moves, to which black can respond 
in any of 20 ways. Thus, on the second turn, 
white may be confronted with any of 400 
possible board positions. After white’s third 
move, there are 7.5 million possible board 
positions; after black’s third move, there are 
225 million possible positions. For a game of 
average length, the number of possible positions 
is approximately $10^{117}$. 
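
The anagram case can be sketched in a few lines of Python. This is only an illustration of exhaustive search: the tiny `WORDS` set is a hypothetical stand-in for a real dictionary, and the function name is our own.

```python
from itertools import permutations

# Hypothetical miniature lexicon; a real solver would load a full word list.
WORDS = {"bird", "bread", "beard", "bared", "debar"}


def solve_anagram_exhaustively(letters, lexicon=WORDS):
    """Check every ordering of the letters and keep those that are words."""
    orderings = {"".join(p) for p in permutations(letters)}
    return len(orderings), sorted(orderings & lexicon)


print(solve_anagram_exhaustively("idrb"))   # 4! = 24 orderings to examine
print(solve_anagram_exhaustively("ebrda"))  # 5! = 120 orderings to examine
```

The number of orderings grows factorially with the number of letters, which is why the same procedure quickly becomes impractical for larger problems.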

Clearly, some method is needed to prune 
the number of possible moves to be con- 
sidered. Such pruning is necessary for hu- 
man solvers owing to the limited capacity 
of working memory; it is also necessary for 
computers when, as in chess, the number of 
possible states is extremely large. Heuristics 
are problem-solving strategies that accom- 
plish this goal. Although heuristics do not 
guarantee solution, they are highly likely to 
lead to success. For example, a good heuristic 
for solving anagrams, especially those with 
five or more letters (e.g., dsyha), is to con- 
sider letter pairs that commonly begin words 
of the given length (e.g., Ronning, 1965). 
This heuristic is useful because, by defini- 
tion, most words begin with common letter 
pairs. Application of this heuristic to the ex- 
ample should quickly lead to the solution, 
shady. That considering common initial let- 
ter pairs is a heuristic rather than an algo- 
rithm is nicely illustrated by a second ana- 
gram, uspyr, which cannot be solved by this 
strategy because its solution begins with an uncommon 
letter pair (the solution, syrup, is the 
only five-letter word in English that begins 
with sy; Novick & Sherman, 2004). 
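
The contrast between the two anagrams can be sketched as follows. The list of common initial letter pairs and the two-word lexicon are hypothetical values chosen for illustration; they are not taken from Ronning (1965) or from any published frequency norms.

```python
from itertools import permutations

# Hypothetical data for illustration only.
COMMON_INITIAL_PAIRS = ["sh", "st", "sp", "pr", "ha", "da"]
LEXICON = {"shady", "syrup"}


def solve_with_initial_pairs(letters, pairs=COMMON_INITIAL_PAIRS, lexicon=LEXICON):
    """Try only orderings that begin with a common initial letter pair."""
    tried = 0
    for pair in pairs:
        rest = list(letters)
        # Skip pairs that cannot be formed from the available letters.
        try:
            for ch in pair:
                rest.remove(ch)
        except ValueError:
            continue
        for tail in permutations(rest):
            tried += 1
            candidate = pair + "".join(tail)
            if candidate in lexicon:
                return candidate, tried
    return None, tried  # the heuristic pruned away the solution


print(solve_with_initial_pairs("dsyha"))  # finds shady after a few candidates
print(solve_with_initial_pairs("uspyr"))  # fails: "sy" is not a common pair
```

The heuristic examines far fewer than 120 orderings, but, unlike exhaustive search, it offers no guarantee: any solution that begins with a pair outside the list is never considered.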

A large body of literature has examined 
the heuristics that people use to generate 
problem solutions. Much of this research 
has focused on puzzle-like problems, such 
as the Tower of Hanoi, that require little 
domain-specific knowledge. These problems 
are useful because they enable researchers 
to focus their attention primarily on the 
process of generating solutions. Newell and 
Simon (1972) were the pioneers in this area 
of research. In the next section, we discuss 
how problem solving can be 
described as a process of heuristic search 
within a specific type of representation, and 
we consider in some detail two important 
search heuristics: hill climbing and means- 
ends analysis. 


Problem Solving as Search Through 
a Problem Space 


Newell and Simon (1972) wrote a magnum 
opus detailing their theory of problem solv- 
ing and presenting several lines of supporting 
evidence. Because their goal was to develop 
a theory to encompass all human problem 
solving, they emphasized what is common 
across the diversity of problems and problem 
solvers. Their fundamental proposal was that 
problem solving could be conceptualized as 
a process of searching through a problem 
space for a path connecting the initial state 
of knowledge (the solver’s understanding of 
the given information) to the goal state (the 
desired solution). 

Problem space is the term Newell and 
Simon (1972) coined to refer to the solver’s 
representation of the task as presented (also 
see Simon, 1978). Briefly, a problem space 
consists of a set of knowledge states (the ini- 
tial state, the goal state, and various possible 
intermediate states), a set of operators that 
allow movement from one knowledge state 
to another, and local information about the 
path one is taking through the space (e.g., 
the current knowledge state and how one 
got there). For the three-disk Tower of Hanoi 
problem, the initial state is illustrated at the 
top of Figure 14.2 (state #1), and the goal 
state is illustrated at the bottom right of that 
figure (state #27). All other knowledge states 
shown in the figure are possible intermedi- 
ate states. The current knowledge state is 
the one at which the solver is located at any 
given point in the solution process. For ex- 
ample, the current state for a solver who has 
made three moves along the optimum so- 
lution path would be state #9. The solver 
presumably would know that he or she ar- 
rived at this state from state #5. This knowl- 
edge allows the solver to recognize a move 
that involves backtracking. Finally, the three 
operators involve moving each of the three disks from one peg to another. 
These operators are subject to the constraint 
that a larger disk may not be placed on a 
smaller disk. 
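
The components of a problem space described above (states, operators, and a record of how each state was reached) can be sketched for the three-disk Tower of Hanoi. The encoding below, in which a state is a tuple giving the peg of each disk, is an assumption made for illustration, and the state numbering of Figure 14.2 is not reproduced. The search used here is exhaustive (breadth-first), not a model of human search.

```python
from collections import deque
from itertools import product

PEGS = (0, 1, 2)  # left, middle, right


def legal_moves(state):
    """Yield states reachable by one operator application.

    state[d] is the peg holding disk d; disk 0 is the smallest.
    """
    for disk, peg in enumerate(state):
        # A disk can move only if no smaller disk sits on top of it.
        if any(state[s] == peg for s in range(disk)):
            continue
        for target in PEGS:
            # Constraint: a larger disk may not be placed on a smaller disk.
            if target != peg and not any(state[s] == target for s in range(disk)):
                yield state[:disk] + (target,) + state[disk + 1:]


def shortest_path(start, goal):
    """Breadth-first search; `parent` records how each state was reached."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in legal_moves(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


all_states = list(product(PEGS, repeat=3))
path = shortest_path((0, 0, 0), (2, 2, 2))
print(len(all_states), len(path) - 1)  # number of states, moves on shortest path
```

Under this encoding the space contains 27 knowledge states, and the shortest path from the initial state to the goal state takes 7 moves.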

Newell and Simon’s (1972) primary fo- 
cus of investigation was the strategies solvers 
use to find a path connecting the initial state 
to the goal state. That is, they sought to 
discover regularities in how solvers search 
through a problem space. In a nutshell, 
search is a serial method for making incre- 
mental progress toward the goal by applying 
operators to move from one knowledge state 
to another adjacent knowledge state. Newell 
and Simon discovered that, for a wide vari- 
ety of problems, solvers’ search is guided by 
a small number of heuristics. 

To investigate these heuristics, Newell 
and Simon (1972) relied on two primary 
methodologies — think-aloud protocols (also 
see Duncker, 1945) and computer simula- 
tion. Solvers were required to say out loud 
everything they were thinking as they solved 
the problem — that is, everything that went 
through verbal working memory. Subjects’ 
verbalizations — their think-aloud protocols — 
were tape-recorded and then transcribed 
verbatim for analysis. This method is ad- 
vantageous for studying problem solving be- 
cause it provides a detailed record of the 
solver’s ongoing solution process. An im- 
portant caveat that must be kept in mind 
while interpreting a subject’s verbalizations 
is that “a protocol is relatively reliable only 
for what it positively contains, but not for 
that which it omits” (Duncker, 1945, p. 11). 
The use of think-aloud protocols to study 
problem solving was popularized by Newell 
and Simon. Ericsson and Simon (1980) pro- 
vided an in-depth discussion of the condi- 
tions under which this method is valid (but 
see Russo, Johnson, & Stephens, 1989, for 
an alternative perspective). To test their in- 
terpretation of a subject’s verbal protocol, 
Newell and Simon created a computer simu- 
lation that was intended to solve the prob- 
lem the same way the subject did. To the ex- 
tent that the computer simulation provided 
a close approximation of the solver’s step- 
by-step solution process, the interpretation 
may be judged useful. Lovett and Anderson 
(Chap. 17) provide an in-depth treatment of 
computer models of thinking. 

Figure 14.3. Problem states on the solution 
path for the Hobbits and Orcs problem. Each H 
represents a Hobbit, each O represents an Orc, 
and the b represents the boat. The two 
horizontal lines indicate the banks of the river. 
State #1 is the initial state, and state #14 is the 
goal state. 


HILL CLIMBING 


Hill climbing is a heuristic in which, at each 
step, the solver applies the operator that 
yields a new state that appears to be most 
similar to the goal state. This heuristic can be 
used whenever solvers can define an evalua- 
tion function that yields information about 
the similarity of the problem state gener- 
ated by a candidate operator to the goal 
state. For example, Chronicle, MacGregor, 
and Ormerod (2004) found evidence that 
subjects use hill climbing to solve various 
problems in which a set of coins has to be re- 
arranged from one configuration to another. 
We illustrate this heuristic using an exam- 
ple of a river-crossing problem (Figure 14.3), 
one of the classic problem types in the 
literature. In the Hobbits and Orcs problem, three Hobbits, three Orcs, 
and a boat are on one side of a river (state #1). 
The goal is to use the boat, which has a ca- 
pacity of only two creatures, to ferry all the 
creatures across the river (state #14). At no 
time may Orcs outnumber Hobbits on either 
side of the river because they will eat the 
Hobbits. The solution path for this problem 
is essentially linear, as shown in Figure 14.3. 

From the initial state, there are two le- 
gal moves available — ferrying two Orcs or 
one Orc and one Hobbit across the river. 
Both moves yield new states that are equally 
similar to the goal state, and so either may 
be chosen. Use of the hill-climbing heuristic 
proceeds smoothly for the most part until 
the solver reaches state #7 in which there 
is one Hobbit and one Orc on the original 
side of the river; the boat and the remaining 
creatures are on the other (goal) side. The 
correct move at this point, in fact the only 
nonbacktracking move, is for one Hobbit and 
one Orc to take the boat back to the original 
side of the river. Thomas (1974) and Greeno 
(1974) found that solvers have particular dif- 
ficulty moving from state #7 to state #8: 
Both the probability of making an incorrect 
move and the time taken to make a move are 
quite large for this transition compared with 
other transitions. According to Wickelgren 
(1974), this difficulty occurs for either of 
two reasons. For solvers who evaluate their 
progress one move at a time, this transition is 
problematic because one must detour more 
than usual by taking two creatures back to 
the original side of the river (logically, only 
one creature is needed to get the boat back 
to the original side). For solvers who evalu- 
ate their progress two moves at a time (i.e., 
round trips of the boat from the original 
side back to the original side), this transi- 
tion is problematic because it results in no 
net progress toward the goal compared with 
state #6. 

The difficulty solvers encounter in mov- 
ing from state #7 to state #8 illustrates 
the primary drawback of the hill-climbing 
heuristic: Sometimes one needs to move ei- 
ther backward or laterally to move forward. 
Climbing a mountain can rarely be accomplished 
solely by following the strategy of 
always stepping uphill. 
Sometimes one needs to walk downhill for a 
while to achieve the ultimate goal of reach- 
ing the mountain top. 
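
The difficulty at state #7 can be illustrated with a small Python sketch. It assumes the classic version of the problem with three Hobbits and three Orcs, encodes a state as (Hobbits on the original bank, Orcs on the original bank, whether the boat is on the original bank), and uses the number of creatures on the goal bank as the evaluation function. The encoding, the function names, and the identification of state #6 as three Hobbits and one Orc on the original bank are our assumptions.

```python
TOTAL = 3  # assumed: three Hobbits and three Orcs, as in the classic version


def is_safe(hobbits, orcs):
    """Orcs may not outnumber Hobbits on a bank that has any Hobbits."""
    return hobbits == 0 or hobbits >= orcs


def successors(state):
    """Return the legal states one boat trip away.

    state = (hobbits on original bank, orcs on original bank, boat there?)
    """
    h, o, boat_at_start = state
    sign = -1 if boat_at_start else 1  # leaving the original bank removes creatures
    result = []
    for dh, do in ((1, 0), (2, 0), (0, 1), (0, 2), (1, 1)):
        nh, no = h + sign * dh, o + sign * do
        if (0 <= nh <= TOTAL and 0 <= no <= TOTAL
                and is_safe(nh, no) and is_safe(TOTAL - nh, TOTAL - no)):
            result.append((nh, no, not boat_at_start))
    return result


def score(state):
    """Hill-climbing evaluation: creatures already on the goal bank."""
    return 2 * TOTAL - state[0] - state[1]


state_6, state_7 = (3, 1, True), (1, 1, False)
options = [s for s in successors(state_7) if s != state_6]  # exclude backtracking
print(options, score(state_7), score(options[0]), score(state_6))
```

Under these assumptions the only non-backtracking move from state #7 carries two creatures back, dropping the evaluation from 4 to 2, which is the same value as state #6. A solver guided only by similarity to the goal therefore has no reason to prefer this move.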


MEANS-ENDS ANALYSIS 


Means-ends analysis is a more sophisticated 
heuristic than hill climbing because it does 
not depend on simple similarity to the goal. 
This heuristic consists of the following steps: 


1. Identify a difference between the cur- 
rent state and the goal (or subgoal) state. 
2. Find an operator that will remove (or re- 
duce) the difference. 
3a. If the operator can be directly applied, 
do so, or 
3b. If the operator cannot be directly ap- 
plied, set a subgoal to remove the obsta- 
cle that is preventing execution of the 
desired operator. 
4. Repeat steps 1 to 3 until the problem is 
solved. 


We illustrate this heuristic with the Tower 
of Hanoi problem. A key difference be- 
tween the initial state and the goal state 
(Figure 14.2) is that the large disk is on 
the wrong peg (step 1). The move-large-disk 
operator is required to remove this differ- 
ence (step 2). However, this operator can- 
not be applied because of the presence of 
the medium and small disks on top of the 
large disk. Therefore, the solver may set a 
subgoal to move that two-disk tower to the 
middle peg (step 3b), thereby leaving the 
right peg free for the large disk. A key difference 
between the initial state and this 
new subgoal state is that the medium disk 
is on the wrong peg. Because application of 
the move-medium-disk operator is blocked, 
the solver sets another subgoal to move 
the small disk to the right peg. This sub- 
goal can be satisfied immediately by apply- 
ing the move-small-disk operator (step 3a), 
generating state #3. The solver then re- 
turns to the previous subgoal — moving the 
tower consisting of the small and medium 
disks to the middle peg. The differences be- 
tween the current state (#3) and the subgoal 
state (#9) can be removed by applying first 
the move-medium-disk operator (yielding 
state #5) and then the move-small-disk operator 
(yielding state #9). Finally, the move-large-disk 
operator is no longer blocked. The 
solver takes that action, moving the large 
disk to the right peg, yielding state #11. No- 
tice that the subgoals are stacked up in the 
order in which they are generated so they 
pop up in the order of last in first out. Given 
the first subgoal in our example, repeated ap- 
plication of the means-ends analysis heuris- 
tic will yield the shortest-path solution indi- 
cated by the thick grey arrows. 
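
The walkthrough above can be sketched as a short Python program with an explicit goal stack. This is a simplified illustration, not a reconstruction of GPS: when an operator is blocked, the sketch sets the subgoal of moving the largest blocking disk to the spare peg, which has the same effect as the "move the two-disk tower" subgoal in the text. The function name and the state encoding (the peg of each disk, with disk 0 the smallest) are our own.

```python
def means_ends_hanoi(n_disks=3, start=0, goal=2):
    """Solve Tower of Hanoi by stacking subgoals (last in, first out)."""
    state = [start] * n_disks  # state[d] = peg of disk d; disk 0 is smallest
    moves = []
    # One goal per disk; the largest disk is on top of the stack and is
    # therefore addressed first.
    stack = [(d, goal) for d in range(n_disks)]
    while stack:
        disk, target = stack[-1]
        if state[disk] == target:
            stack.pop()  # step 1: no difference remains for this goal
            continue
        # Smaller disks on the source or target peg block the operator.
        blockers = [s for s in range(disk) if state[s] in (state[disk], target)]
        if blockers:
            spare = 3 - state[disk] - target
            stack.append((max(blockers), spare))  # step 3b: set a subgoal
        else:
            moves.append((disk, state[disk], target))  # step 3a: apply operator
            state[disk] = target
            stack.pop()
    return moves


for disk, source, target in means_ends_hanoi():
    print(f"move disk {disk} from peg {source} to peg {target}")
```

For three disks the sketch produces seven moves, beginning with the small disk to the right peg, the medium disk to the middle peg, the small disk to the middle peg, and the large disk to the right peg, in the same order as the walkthrough.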

The key difference between hill climbing 
and means-ends analysis is the online generation 
of subgoals in the latter heuristic. 
Adding new subgoals during problem solv- 
ing greatly increases the power of heuristic 
search. Subgoals provide direction, and to 
the extent that they are appropriate, they 
can be expected to prune the space of pos- 
sible states. Moreover, by assessing progress 
toward a required subgoal rather than the fi- 
nal goal, solvers may be able to make moves 
that otherwise seem unwise. To take a con- 
crete example, consider the transition from 
state #1 to state #3 in Figure 14.2. Compar- 
ing the initial state with the goal state, we 
find that this move seems unwise because 
it places the small disk on the bottom of 
the right peg, whereas it ultimately needs 
to be at the top of the tower on that peg. 
However, if one compares the initial state 
with the solver-generated subgoal state of 
having the medium disk on the middle peg, 
this is exactly where the small disk needs to 
go. More generally, generating subgoals al- 
lows solvers to plan several moves ahead. 
(Duncker, 1945, also talked about the im- 
portance of subgoals.) 

As we noted in our brief historical review, 
means-ends analysis is the heuristic that GPS 
used to successfully model human problem 
solving across a wide variety of tasks (Ernst 
& Newell, 1969; Newell & Simon, 1972). A 
large body of research has found that means-ends 
analysis tends to be people’s preferred 
solution method for novel problems that are 
relatively free of specialized content and for 
which a definite goal is given (Greeno & 
Simon, 1988) — for example, the Tower of 


of finding the roots of a quadratic equation 
or of unscrambling an anagram. 


Some Conclusions from Research 
on Problem Solving as Search 


Newell and Simon’s (1972) goal was to 
discover general problem-solving strategies 
that are common across problem solvers 
and across problems. One important con- 
tribution of their work concerns the meth- 
ods they adopted for studying this issue. 
Duncker (1945) was an early advocate of col- 
lecting think-aloud protocols, and he used 
this methodology very successfully to study 
problem solving. With the rise to dominance 
of behaviorism and the fall of the Gestalt ap- 
proach to psychology, however, this method- 
ology fell into disfavor. Newell and Simon 
(1972) brought a high degree of scientific 
rigor to the collection of verbal protocols, 
enabling this methodology to gain a degree 
of acceptance in the field that it did not 
previously enjoy. In addition, Newell and 
Simon were among the early pioneers in the 
use of computer simulation as a tool for 
testing theories of psychological processes. 
Both of these methods are now seen as ordi- 
nary rather than exotic means of investigat- 
ing problem solving (as well as other cogni- 
tive processes). 

Newell and Simon’s (1972) goal of un- 
covering general problem-solving strategies 
necessitated a focus on the solution of puz- 
zles such as the Tower of Hanoi and Hobbits 
and Orcs, which are relatively uncontami- 
nated by domain knowledge that necessar- 
ily varies across individuals. This focus was 
much like Ebbinghaus’ strategy of investigat- 
ing general principles of memory by study- 
ing nonsense syllables. Using this strategy, 
Newell and Simon and their colleagues made 
important contributions to the field of prob- 
lem solving: Means-ends analysis and other 
heuristics are very flexible and general strate- 
gies that people frequently use to success- 
fully solve a large variety of problems. 

Nevertheless, the view of problem solving 
as search through a problem space does not 
provide a complete understanding of how 
people solve problems. Although solvers rely 
on general-purpose search heuristics when 
they encounter novel problems, because 
these heuristics are weak and fallible, they 
abort them as soon as they acquire some 
knowledge about the particular problem 
space. At that point, they switch to more 
specialized strategies (e.g., Anzai & Simon, 
1979). In general, whenever solvers have 
some relevant background knowledge, they 
tend to use stronger, albeit more narrowly 
applicable, domain-specific methods. The 
impact of learning and domain knowledge 
on strategy use led problem-solving re- 
searchers to turn their attention from the 
solution of knowledge-lean puzzles and rid- 
dles to problems that made connections to 
solvers’ background knowledge. This shift is 
analogous to memory and comprehension 
researchers’ switch from studying nonsense 
syllables to studying words, paragraphs, and 
stories in order to understand the role of 
prior knowledge in memory and compre- 
hension. As we noted in the introduction, 
background knowledge plays an important 
role in determining the representation a 
solver constructs for a problem, which, in 
turn, affects the processes the solver uses to 
generate a solution. In the next two sections, 
we focus on problem representation and the 
interplay between representation and solu- 
tion, respectively. 


Problem Representation 


Overview 


In problems such as the Tower of Hanoi and 
Hobbits and Orcs, all the problem compo- 
nents — the initial conditions, the goal, the 
means for generating and evaluating the so- 
lution, and the constraints — are well defined 
in the problem as presented. In most real- 
world problems, however, the solver has to 
define one or more of the problem compo- 
nents. For example, a person’s desire to cook 
a tasty dinner, a student’s aspiration to write 
a term paper that will earn a grade of “A,” 
and a young executive’s need to find suitable 
housing are all examples of ill-defined problems 
(Reitman, 1965). In these problems, 
it is not clear from the outset 
how to determine that the goal has been accomplished. 
For example, what constitutes 
a tasty dinner, and how does one decide that 
a particular recipe is tasty enough? It seems 
obvious that a cook’s definition of the goal 
state will depend on his or her background 
knowledge. A poor graduate student might 
picture homemade pizza, a parent of young 
children might imagine lasagna, an Indian 
couple without children might think of spicy 
lamb vindaloo, and a gourmet cook might 
visualize beef Wellington. The tasty dinner 
problem is ill defined in other ways as well. 
The cook has to define the given informa- 
tion (only ingredients found at home or also 
those at the grocery store?), the operators 
(e.g., to bake or stir fry or simmer on the 
stove), and the constraints (e.g., time, cost, 
the differing tastes of adults and children). 

As we noted earlier, the Gestalt psychol- 
ogists focused their attention on the factors 
that affect how people define, understand, 
or represent problems. Greeno (1977), in 
specific counterpoint to Newell and Simon’s 
(1972) focus on problem solving as search, 
also highlighted the central importance of 
representation. More recently, researchers 
who have studied problem solving in par- 
ticular knowledge domains (e.g., mathemat- 
ics, physics, medical diagnosis) have also em- 
phasized the critical role of representation 
in successful problem solving. Their investi- 
gations have shown that various aspects of 
the problem situation, as well as people’s 
background knowledge, affect how people 
represent problems and, in turn, how they 
generate problem solutions. The trains and 
bird problem we discussed at the outset 
(Figure 14.1) provides an anecdotal exam- 
ple of the importance of the representation 
constructed for the ultimate success of one’s 
solution attempt. 

We stated informally at the outset that 
a problem representation is a model of the 
problem constructed by solvers to summa- 
rize their understanding of the problem’s 
essential nature. More specifically, a repre- 
sentation has four components (Markman, 
1999): (1) a represented world — in this case, 
the description of the problem to be solved, 
(2) a representing world, a set of elements 
to be used to depict the objects and relations 
in the represented world, (3) a set of rules 
that map elements of the represented world 
to elements of the representing world, and 
(4) a process that uses the information in the 
representing world — in this case, to solve 
the problem. This last component high- 
lights the link between representation and 
solution: Without some process that uses the 
information in the representation for some 
purpose, the so-called representation has no 
symbolic meaning (i.e., it does not serve a 
representational function). 

The representation a solver uses to sup- 
port and guide problem solving can be either 
internal (residing in working memory) or ex- 
ternal (e.g., drawn on paper). In either case, 
the elements of the representing world may 
follow a variety of different formats. Some 
representations are best described as verbal 
or propositional or declarative. Others are 
pictorial or diagrammatic, such as a drawing 
of a pulley system, a matrix or network, and a 
bar or line graph (see Hegarty, Carpenter, & 
Just, 1991, for a discussion of types of di- 
agrammatic representations). Finally, some 
representations are “runnable” mental mod- 
els (e.g., a mental abacus — Stigler, 1984; a 
system of interlocking gears — Schwartz & 
Black, 1996). 

In the previous section of this chapter, 
we highlighted how solvers generate prob- 
lem solutions, leaving in the background the 
question of how they represent the informa- 
tion in the problem. In this section, we take 
the opposite perspective, highlighting the 
problem representations that solvers con- 
struct and leaving in the background the 
methods by which those representations are 
used to generate the solution. We consider 
solution only as a dependent measure (i.e., 
accuracy and/or solution time), illustrating 
that differences in problem representation 
affect problem solution. Our discussion of 
research in this area is organized around two 
classes of factors that have been found to af- 
fect the representation that solvers select or 
construct for the problem at hand — problem 
context and solver’s knowledge. In the next 
section of the chapter, we consider the interplay 
between representation and solution, 
focusing there on studies showing that the 
representation one constructs for a problem 
affects how one generates the solution. 


The Importance of Problem Context 


A number of studies have found that various 
aspects of the problem context have a strong 
influence on the representations solvers con- 
struct. In this section, we describe three such 
studies, which illustrate three different types 
of problem context effects. The first study il- 
lustrates an effect of the perceptual form of 
the problem, the second study shows an ef- 
fect of semantic interpretation based on how 
objects are used, and the third study demon- 
strates an effect of the story content of the 
problem. 


PERCEPTUAL FORM 


Problems that are presented as visual dis- 
plays or diagrams may provide informa- 
tion about configuration that solvers deem 
relevant to the solution and include in 
their problem representation. This effect is 
nicely illustrated by Maier’s (1930) nine-dot 
problem: Nine dots are arrayed in a 3 x 
3 grid, and the task is to connect all the 
dots by drawing four straight lines without 
lifting one’s pencil from the paper. People 
have difficulty solving this problem because 
their initial representations generally include 
a constraint, inferred from the configuration 
of dots, that the lines cannot go outside the 
boundary of the imaginary square formed by 
the outer dots. With this constraint implied 
by the perceptual form of the dots, the prob- 
lem cannot be solved (but see Adams, 1979). 
Without this constraint, the problem may be 
solved as shown in Figure 14.4. 

The nine-dot problem is a classic insight 
problem. According to the Gestalt view 
(e.g., Duncker, 1945; Maier, 1931; see Ohlsson, 
1984a, for a review), the solution to an 
insight problem appears suddenly, accom- 
panied by an “aha!” sensation, immediately 
following the sudden restructuring of one’s 
understanding of the problem: “The decisive 
points in thought-processes, the moments of 
sudden comprehension, of the ‘Aha!,’ of the 
new, are always at the same time moments 
in which such a sudden restructuring of 
the thought-material takes place” (Duncker, 
1945, p. 29). For the nine-dot problem, one 
view of the required restructuring is that the 
solver relaxes the constraint implied by the 
perceptual form of the problem and realizes 
that the lines in fact may extend past the 
boundary of the imaginary square. 

To test this view, in one experiment 
Weisberg and Alba (1981) compared the per- 
formance of control subjects who were given 
20 attempts to solve the nine-dot problem 
with that of other subjects who received 
10 attempts before a restructuring hint, fol- 
lowed by 10 attempts after the hint. The 
restructuring hint involved telling subjects 
that they had exhausted all possibilities in- 
side the square, and so they had to go outside 
the square to solve the problem. No sub- 
ject in either condition solved the problem 
in the first 10 tries, and no subject in the 
control condition ever solved the problem 
(excluding those who had seen the problem 
before). However, 20% of the restructuring 
hint group solved the problem in the sec- 
ond 10 tries. A follow-up study that gave 
subjects many more solution attempts repli- 
cated these results. Interestingly, solution 
was neither quick nor direct following the 
restructuring hint in either study, for sub- 
jects generally required 5 to 11 solution at- 
tempts after the hint before solving the prob- 
lem. Moreover, 75% to 80% of the subjects 
failed to solve the problem despite the hint. 
Thus, restructuring, as provided by Weisberg 
and Alba’s hint, appears to be necessary but 
not sufficient for solution. We reconsider the 
nature of insight in “The Interplay Between 
Representation and Solution.” 


OBJECT-BASED INFERENCES 


In addition to making inferences from the 
perceptual form of a presented figure, solvers 
may draw inferences from the specific enti- 
ties that appear in a problem, and these in- 
ferences may likewise affect the constructed 
problem representation. A classic example 
of such inferences is the phenomenon of 
functional fixedness introduced by Duncker 
[...] 
purpose, or is habitually used for a certain 
purpose, it is difficult to see that object as 
having properties that would enable it to 
be used for a dissimilar purpose. Duncker’s 
basic experimental paradigm involved two 
conditions that varied in terms of whether 
the object that was crucial for solution was 
initially used for a function other than that 
required for solution. 

Consider the candles problem, the most 
well-known of the five problems Duncker 
(1945) investigated. Three candles are to be 
mounted at eye height on a door. On the ta- 
ble for use in completing this task are some 
tacks and three boxes. The solution is to 
tack the three boxes to the door to serve 
as platforms for the candles. In the control 
condition, the three boxes were presented 
to subjects empty. In the functionally fixed 
condition, the three boxes were filled with 
candles, tacks, and matches. Thus, in the lat- 
ter condition, the boxes initially served the 
function of container, whereas the solution 
requires that they serve the function of plat- 
form. The results showed that 100% of the 
subjects who received empty boxes solved 
the candles problem compared with only 
43% of subjects who received filled boxes. 
Every one of the five problems showed a dif- 
ference favoring the control condition over 
the functionally fixed condition with aver- 
age solution rates across the five problems 
of 97% and 58%, respectively. In “The Inter- 
play Between Representation and Solution” 
we discuss additional examples of object- 
based inferences that link semantic content 
to representation and then to the method of 
solution adopted. 


STORY CONTENT 


Figure 14.4. A solution to the nine-dot problem. 

In our earlier discussion of the trains and bird 
problem, we mentioned that the text is writ- 
ten such that it invites the solver to focus 
on the motion of the bird [Figure 14.1(a)] 
rather than of the trains [Figure 14.1(b)]. In 
general, the story content and phrasing of 
the problem text may affect how the solver 
represents the problem. Hayes and Simon 
(1977; also see Kotovsky, Hayes, & Simon, 
1985) provided empirical evidence that dif- 
ferences in the descriptions of the operators 
in two isomorphic (i.e., structurally equiva- 
lent) problems yielded quite different rep- 
resentations with important consequences 
for the problems’ relative difficulty. They 
used several variants of the Tower of Hanoi 
problem that concerned monsters and globes 
that came in three sizes: small, medium, 
and large. We discuss one “transfer” variant 
and one “change” variant used in Hayes and 
Simon’s research. For both variants, the ini- 
tial state had the small monster holding the 
large globe, the medium-size monster hold- 
ing the small globe, and the large monster 
holding the medium-size globe. The goal 
was for each monster to have a globe pro- 
portionate to its own size. Both variants 
can be mapped onto the Tower of Hanoi 
problem states shown in Figure 14.2. If we 
map the small, medium, and large monsters 
onto the left, center, and right pegs, respec- 
tively, and map increasing globe size onto 
decreasing disc size, both monster variants 
are equivalent to the task of getting from 
state #12 to state #5 in Figure 14.2. 

The only difference between the two 
monsters and globes isomorphs concerned 
the description of the operators. In the trans- 
fer variant, subjects were told that the mon- 
sters could transfer the globes from one to 
another as long as they followed three rules: 
(1) Only one globe may be transferred at a 
time; (2) if a monster is holding multiple 
globes, only the larger globe may be trans- 
ferred; and (3) a globe cannot be transferred 
to a monster that is holding a larger globe. 
In the change variant, subjects were told 
that the monsters could shrink and expand 
themselves according to the following rules: 
(1) Only one monster may change size at 
a time; (2) if two monsters are the same 
size, only the one holding the larger globe 
may change size; and (3) a monster may not 
change size so it becomes the same size as an- 
other monster that is holding a larger globe. 

Because these two problems are struc- 
turally identical, they can be solved by mak- 
ing the same sequence of moves in the same 
problem space. However, the subjects did 
not translate the problems to a common rep- 
resentation. Rather, they accepted the cover 
story as given and, depending on the vari- 
ant they received, proceeded to either move 
globes or change monster sizes. The differ- 
ent representations and operators adopted 
were apparent in the written notations pro- 
duced by subjects as they solved the prob- 
lem (Hayes & Simon, 1977). Importantly, 
the representation constructed had a large 
effect on solution time: The transfer vari- 
ant took about 14 minutes to solve com- 
pared with about 29 minutes for the change 
variant. The greater difficulty of the change 
variant is due to an additional step needed 
to check that the operator constraints have 
been satisfied. 
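
The structural equivalence of the two variants can be made concrete with a short sketch. The Python below is only an illustration under our own assumptions: the tuple encoding of states, the function names, and the use of breadth-first search are ours and are not taken from Hayes and Simon's (1977) materials or notation. Each rule set is written in the vocabulary of its own cover story, yet both generate the same legal moves from every state.

```python
from collections import deque
from itertools import product

SMALL, MEDIUM, LARGE = 0, 1, 2
SIZES = (SMALL, MEDIUM, LARGE)

# Hypothetical encoding, indexed by globe size.
# Transfer variant: state[g] = the monster currently holding globe g.
# Change variant:   state[g] = the current size of the monster holding globe g.
START = (MEDIUM, LARGE, SMALL)  # small globe with medium monster, and so on
GOAL = (SMALL, MEDIUM, LARGE)   # every globe matches its monster's size


def transfer_moves(state):
    """Legal successor states when monsters pass globes to one another."""
    for globe, receiver in product(SIZES, SIZES):
        holder = state[globe]
        if receiver == holder:
            continue
        larger_globes = [g for g in SIZES if g > globe]
        # Rule 2: only the largest globe a monster holds may be transferred.
        if any(state[g] == holder for g in larger_globes):
            continue
        # Rule 3: the receiver may not already hold a larger globe.
        if any(state[g] == receiver for g in larger_globes):
            continue
        yield state[:globe] + (receiver,) + state[globe + 1:]


def change_moves(state):
    """Legal successor states when monsters keep their globes and change size."""
    for monster, new_size in product(SIZES, SIZES):  # a monster is named by its globe
        size = state[monster]
        if new_size == size:
            continue
        rivals = [m for m in SIZES if m > monster]  # monsters holding larger globes
        # Rule 2: among same-size monsters, only the one with the larger globe changes.
        if any(state[m] == size for m in rivals):
            continue
        # Rule 3: may not become the size of a monster holding a larger globe.
        if any(state[m] == new_size for m in rivals):
            continue
        yield state[:monster] + (new_size,) + state[monster + 1:]


def shortest_solution(moves, start=START, goal=GOAL):
    """Breadth-first search; returns the minimum number of moves, or None."""
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        state, depth = frontier.popleft()
        if state == goal:
            return depth
        for nxt in moves(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return None


def same_problem_space():
    """True if both rule sets allow exactly the same moves from every state."""
    return all(
        set(transfer_moves(s)) == set(change_moves(s))
        for s in product(SIZES, repeat=3)
    )
```

Under this encoding, `shortest_solution(transfer_moves)` and `shortest_solution(change_moves)` return the same path length. That is all structural identity promises; the sketch says nothing about the extra checking that made the change variant slower for human solvers.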


The Importance of Solvers’ Knowledge 


In the previous section, we discussed prob- 
lem factors that affect the representations 
solvers construct. However, the extent to 
which solvers respond to various problem 
factors depends on their prior experience 
and background knowledge. Consider, for 
example, the following mathematical word 
problem: “Susan has 12 cookies and three 
boxes. How many cookies should she place 
in each box in order to divide them up 
fairly?” A child who has sufficient experience 
with solving such problems is likely to repre- 
sent this problem in terms of its mathemati- 
cal structure — simple division. In contrast, 
a child who has never encountered such 
[...] of human motivation and behavior. This 
child might consider the size of the cook- 
ies and the boxes, or wonder who Susan is 
and why she wants to put the cookies in 
the boxes. In general, solvers’ background 
knowledge affects whether and to what ex- 
tent they focus their attention on problem 
aspects that are or are not relevant to deter- 
mining the solution. In this section, we dis- 
cuss three types of background knowledge 
that pertain to solvers’ understanding of the 
problem at hand. First, we consider solvers’ 
prior experience with a structurally similar 
or analogous problem. Second, we consider 
their generalized schemas for types of so- 
lution procedures as well as types of com- 
mon representational tools (e.g., matrices). 
Third, we consider differences in problem 
representation that are due to differences in 
solvers’ domain expertise. 


EXPERIENCE WITH A STRUCTURALLY SIMILAR OR 
ANALOGOUS PROBLEM 

A large body of research has examined peo- 
ple’s use of specific examples of problems 
to help them understand and solve a cur- 
rent problem. An example can be helpful 
for solving a novel problem only if the two 
problems have a similar underlying structure 
because a problem’s structure is what de- 
termines appropriate solution methods (e.g., 
division for the cookie problem). The ex- 
ample will not be helpful if the problems 
only share a similar cover story and involve 
similar objects (e.g., a person, cookies, and 
boxes) but differ in their underlying struc- 
ture (e.g., in the example Susan distributes 
cookies among boxes, but in the novel prob- 
lem Leah removes one cookie from each 
box). Research on analogical problem solv- 
ing (also referred to as analogical trans- 
fer) shows that solvers’ understanding, or 
representation, of a novel problem can be 
facilitated by prior experience with an anal- 
ogous (i.e., structurally equivalent) problem. 
However, people may fail to retrieve an anal- 
ogous problem from memory, or fail to ap- 
ply an analogous solution, if they focus their 
attention on the solution-irrelevant differ- 
ences between the example and the novel 
[...] 
in-depth treatment of research on analogy. 
Here, we describe in detail only a single, now 
classic, study (Gick & Holyoak, 1980) that 
illustrates this line of research. 

Gick and Holyoak (1980) used Duncker’s 
(1945) radiation problem as their target 
(novel) problem. This problem involves 
finding a way to use some rays to destroy 
a patient’s stomach tumor, without harming 
the patient. At sufficiently high intensity, the 
rays will destroy the tumor. However, at that 
intensity they will also destroy the healthy 
tissue surrounding the tumor. At lower in- 
tensity, the rays will not harm the healthy 
tissue, but they also will not destroy the tu- 
mor. The desired solution is to project multi- 
ple low-intensity rays at the tumor from sev- 
eral points around the patient. The rays will 
converge on the tumor, where their individ- 
ual intensities will sum to a level sufficient to 
destroy the tumor. Baseline use of this con- 
vergence solution is quite low — about 10% 
(Gick & Holyoak, 1980). Gick and Holyoak 
examined whether solvers’ understanding of 
the radiation problem, as indexed by their 
use of the convergence solution, might be 
facilitated by prior exposure to an analogous 
situation. To this end, they had subjects at- 
tempt to solve the radiation problem after 
having previously read a story that described 
the following analogous situation: A general 
was trying to capture a fortress controlled by 
a dictator. Multiple roads led to the fortress 
from all directions. However, the roads were 
mined in such a way that large groups of sol- 
diers could not travel on them. The general 
decided to send a separate small group of sol- 
diers down each of the various roads so the 
full army would converge at the fortress. In 
this way, he was able to overthrow the evil 
dictator and capture the fortress. 

Gick and Holyoak (1980) found that sub- 
jects generally did not spontaneously notice 
that the story about the fortress was relevant 
to solving the radiation problem: Only about 
20% provided the convergence solution to 
that problem after having read the fortress 
story. However, when these same subjects 
were subsequently given a simple hint indi- 
cating that one of the stories they had read 
[...] radiation problem, about 75% generated the 
convergence solution. These results indicate 
that solvers may fail to spontaneously notice 
the relevance of problems stored in mem- 
ory for understanding and solving a current 
problem, although they are able to use the 
prior problem appropriately when its rele- 
vance is highlighted. 

An important factor that mediates spon- 
taneous retrieval and use of analogous solu- 
tions is people’s understanding of the learn- 
ed example. Chi, Bassok, Lewis, Reimann, 
and Glaser (1989) investigated this issue 
in the domain of physics, using problems 
from elementary mechanics. They found 
that learners who understood the logic of 
textbook examples spontaneously applied 
the example problems’ solutions to analo- 
gous test problems that differed from the 
learned examples in many respects. How- 
ever, poor learners failed to recognize the 
structural similarity between the examples 
and the novel problems. People’s ability to 
exploit analogous solutions also depends on 
their domain expertise. We discuss expertise 
differences in problem representation after 
considering schematic knowledge. Then, in 
“The Interplay Between Representation and 
Solution” we consider the implications of ex- 
pertise differences for analogical transfer. 


GENERAL SCHEMAS IN MEMORY 


In addition to knowledge of specific prob- 
lems encountered in the past, solvers also 
have in memory abstract schemas for types 
of problems, types of solution procedures, 
and types of representations. These schemas 
are abstract in the sense that they include 
information that is common to multiple 
problems of a particular type but exclude 
information that is idiosyncratic to the in- 
dividual problems over which the abstrac- 
tion has occurred. For example, an abstract 
schema for the convergence solution would 
specify that multiple, low-intensity forces 
converge from different directions on a cen- 
tral target, but it would not specify that the 
forces are soldiers (or rays) or that the tar- 
get is a fortress (or a tumor). A number of 
[...] 
procedures can be induced by comparing 
two or more analogous problems (with their 
solutions) or by successfully solving one 
problem by analogy to another (solved) 
problem, and such schema induction in turn 
facilitates understanding and solution of sub- 
sequent analogous problems (e.g., Bassok 
& Holyoak, 1989; Gick & Holyoak, 1983; 
Novick & Holyoak, 1991; Ross & Kennedy, 
1990). Research on solution schemas is 
discussed in more detail by Holyoak 
(Chap. 6). 

In the remainder of this section, we dis- 
cuss some of the recent research on repre- 
sentation schemas (Hurley & Novick, 2004; 
Novick, 2001; Novick, Hurley, & Francis, 
1999). This research shows that college stu- 
dents possess abstract schemas for three 
spatial diagrams — matrices, networks, and 
hierarchies — that are important tools for 
understanding and solving problems from a 
variety of domains (see Tversky, Chap. 10, 
for a general review of visuospatial reason- 
ing). These schemas presumably were in- 
duced over the course of students’ in-school 
and out-of-school experiences with concrete 
instances of these diagrams in use (Novick, 
2001). For example, matrices are used for 
multiplication tables, time schedules, grade 
books, and seating charts. The spatial dia- 
gram schemas seem to be at an intermedi- 
ate level of generality (Novick et al., 1999): 
Each type of diagram is best suited for a par- 
ticular type of relational structure, regardless 
of the content domain in which that struc- 
ture is embedded. For example, a matrix is 
appropriate whenever (1) all possible com- 
binations of items across two sets must be 
considered, (2) the relation between items is 
associative (i.e., nondirectional), and (3) it is 
important to be able to distinguish between 
items that are related and those that are not 
(Novick & Hurley, 2001). The abstract rep- 
resentation schemas are more useful than are 
specific relevant example problems for un- 
derstanding the structures of novel problems 
(Novick et al., 1999). 

To measure problem understanding, 
Novick et al. (1999) asked subjects to select 
the most appropriate type of spatial diagram 
[...] problems. (Solving these problems would 
have required using analytical or mathemati- 
cal reasoning.) In one experiment, some sub- 
jects participated in a specific example con- 
dition, whereas other subjects participated 
in a general category condition. The initial 
task in the specific example condition pro- 
vided subjects with three example problems, 
each illustrating the use of a different one 
of the three spatial diagrams. Subjects spent 
6 minutes solving each example problem us- 
ing the diagrammatic representation given. 
In contrast, the initial task in the general cat- 
egory condition was designed to cue the ab- 
stract schemas that subjects were hypoth- 
esized to have in memory. Subjects were 
shown (one at a time) an abstract (empty) 
hierarchy, matrix, and network. Above each 
diagram was a short phrase naming the type 
of diagram (e.g., “a network or system of 
paths”). Subjects saw the abstract diagrams 
for 20 seconds each and were asked to famil- 
iarize themselves with the diagrams so they 
would have clearly in mind what each one is 
like for the next task. 

If college students possess at least rudi- 
mentary abstract schemas for the three spa- 
tial diagrams, then the brief (20-second) 
study times for the abstract diagrams 
presented in the general category condition 
should have been sufficient to cue those 
schemas. Abstract schemas provide a more 
reliable source of knowledge for understand- 
ing new problems than do specific example 
problems because the schemas do not con- 
tain specific story content (Holyoak, 1985). 
In contrast, example problems do contain 
specific content, and this content must be 
ignored when it mismatches that of the 
novel problems. Given this difference be- 
tween abstract schemas and concrete ex- 
amples, Novick et al. (1999) predicted that 
subjects in the general category condition 
would be more successful than those in the 
specific example condition at selecting the 
most appropriate type of representation for 
the test problems that required spatial di- 
agram representations. The results strongly 
supported this prediction: Cueing subjects’ 
[...] 
each abstract diagram for 20 seconds greatly 
facilitated understanding of the test prob- 
lems compared with spending 6 minutes 
studying and successfully solving each of the 
relevant example problems. 


EXPERTISE 


The studies discussed in the previous two 
sections examined the effects of back- 
ground knowledge on problem representa- 
tion among typical college students. It has 
also proved to be especially interesting to 
investigate problem representation among 
people who differ with respect to their ex- 
pertise in the domain under investigation. 
Duncker (1945) was perhaps the first psy- 
chologist to note that experts and novices in 
a domain focus their attention on different 
aspects of that domain, leading them to con- 
struct problem representations that are quite 
different: Whereas experts’ representations 
tend to highlight solution-relevant structural 
features (in particular, meaningful causal re- 
lations among the objects in the problem), 
novices’ representations tend to highlight 
solution-irrelevant superficial features (e.g., 
the particular objects themselves or how the 
question is phrased). Evidence for these rep- 
resentational differences has been found us- 
ing a wide variety of experimental tasks and 
procedures. 

A number of studies have found that ex- 
perts’ attention is quickly captured by mean- 
ingful configurations within a presented 
stimulus, a result that calls to mind the 
Gestalt view that problem solving is related 
to perception. In contrast, novices’ atten- 
tion is focused on isolated components of 
the stimulus. Perhaps the earliest research 
investigating this issue comes from the do- 
main of chess (Chase & Simon, 1973; de 
Groot, 1966). In the typical study, subjects 
view 20 or more chess pieces arranged on a 
chess board for 5 seconds and then have to 
immediately reconstruct what they saw on a 
new chess board. The arrangement of chess 
pieces is either from the middle of a real 
game or is random. When the arrangement 
[...] dramatically as a function of expertise, from 
about 5 pieces for novices to about 20 pieces 
for players at the level of International Mas- 
ter or above (Gobet & Simon, 1996). Recall 
also improves with expertise for random po- 
sitions, although the effect is much smaller 
(from about 2.6 to 5.3 pieces; Gobet & 
Simon, 1996). These expertise differences 
can be explained by the hypothesis that ex- 
pert chess players have stored in memory 
meaningful groups (chunks) of chess pieces. 
Chase and Simon (1973) found evidence for 
such chunks based on an analysis of the la- 
tencies between recall of consecutive pieces. 
Better recall of structured or meaningful 
stimuli by experts than by novices has been 
found in many other domains as well: Circuit 
diagrams (Egan & Schwartz, 1979), com- 
puter programming (McKeithen, Reitman, 
Rueter, & Hirtle, 1981), medicine (Coughlin 
& Patel, 1987; Myles-Worsley, Johnston, & 
Simons, 1988), basketball and field hockey 
(Allard & Starkes, 1991), and figure skating 
(Deakin & Allard, 1991). 

Evidence for representational differences 
between experts and novices also comes 
from studies in which subjects were asked 
to sort problems into groups based on how 
they would be solved. In one of the early 
studies using this methodology, Chi et al. 
(1981) asked students to group physics (me- 
chanics) word problems into categories of 
related problems. They found that advanced 
physics graduate students tended to group 
the problems according to the physics prin- 
ciples required for solution (e.g., conserva- 
tion of energy). In contrast, undergraduates 
who had successfully completed an intro- 
ductory physics course tended to group the 
problems according to the types of objects 
presented (e.g., springs versus pulleys versus 
inclined planes). 

Comparable results have been found in 
the domains of mathematics and com- 
puter programming using measures based on 
both problem sorting and free recall (Adel- 
son, 1981; McKeithen et al., 1981; Silver, 
1979, 1981; Weiser & Shertz, 1983). These 
knowledge-based differences in problem 
[...]ematical domains. For example, Kindfield 
(1993/1994) analyzed the chromosome di- 
agrams produced by subjects who varied in 
their degree of formal training in genetics as 
they reasoned about the process of meiosis. 
She found that the more expert subjects pro- 
duced more abstract chromosome diagrams 
that highlighted the features that were bio- 
logically relevant to the problem at hand. In 
contrast, the diagrams of the less advanced 
subjects more literally resembled chromo- 
some appearance under a light microscope, 
including aspects such as dimensionality and 
shape that have no bearing on the process 
of meiosis. 

Similar findings also have emerged from 
research involving geometric analogies, a 
problem type that does not seem to in- 
volve detailed domain knowledge. Schiano, 
Cooper, Glaser, and Zhang (1989) asked 
high school students who had received very 
low or very high scores on a standardized 
geometric analogy test to sort proportional 
analogies (of the form A:A’::B:B’) involv- 
ing geometric figures into groups of related 
problems. They found that the low-scoring 
students tended to sort the problems accord- 
ing to superficial perceptual similarities. For 
example, they put the problems involving 
circles and those involving partially shaded 
hexagons into separate piles. In contrast, 
the high-scoring students tended to sort the 
problems according to the abstract, transfor- 
mational relations underlying solution. For 
example, they put the problems involving 
rotations and those involving size transfor- 
mations into separate piles. 

It is important to note that these rep- 
resentational differences between experts 
and novices (or between people who are 
highly skilled versus less skilled in a do- 
main) are a matter of emphasis and de- 
gree. With increasing expertise/knowledge, 
there is a gradual change in the focus of 
attention and in the problems that are 
seen as related, and the extremes are not 
quite as extreme as summaries of the differ- 
ences often suggest (e.g., Deakin & Allard, 
1991; Hardiman, Dufresne, & Mestre, 1989; 
[...] et al., 1988; Schoenfeld & Herrmann, 1982; 
Silver, 1981). 


The Interplay Between Representation 
and Solution 


So far, we have considered problem rep- 
resentation and the process of generating 
problem solutions separately. We noted at 
the outset, however, that these topics are 
inherently interrelated: The representation 
one constructs is likely to affect how one 
goes about generating a solution. A classic 
example comes from Wertheimer (1959). 
Students are generally taught how to com- 
pute the area of a parallelogram as shown in 
Figure 14.5(a). Wertheimer distinguished 
two groups of students based on their rep- 
resentations of the solution method. Some 
students constructed what we might call to- 
day a procedural representation. They were 
able to compute the area by rote appli- 
cation of the learned formula. The repre- 
sentations of other students reflected good 
conceptual understanding of the solution 
method, namely that a triangle can be cut 
off from one side of the geometric figure 
and pasted onto the other side to create 
a rectangle to which the learned formula 
then obviously applies. Wertheimer found 
that students who represented the problem 
as one of converting the parallelogram into 
a rectangle were able to find the area of 
the quadrilateral in Figure 14.5(b) and that 
of the irregularly shaped geometric figure 
in Figure 14.5(c) by similar conversion of 
those figures into rectangles as shown by 
the superimposed dashed lines. In contrast, 
students who represented the parallelogram 
problem in terms of the appropriate formula 
to apply were stumped by the problems 
presented in Figures 14.5(b) and 14.5(c), 
because the formula is not applicable to 
those problems as presented (because the 
figures are not parallelograms). These results 
demonstrate that structural understanding 
(exemplified by the convert-to-rectangle so- 
lution method) enables solvers to recognize 
[...] 
appearance. 

In this section, we review research that 
highlights this interplay between represen- 
tation and solution generation. The first part 
of our review focuses on problem solving in 
mathematics. As suggested by our initial ex- 
ample from Wertheimer (1959), this is a do- 
main in which the interplay between rep- 
resentation and solution generation is easy 
to see. We show how the effects on repre- 
sentation of several of the solver and prob- 
lem factors identified in the previous section 
have consequences for the solution method 
employed. In the second part of our re- 
view, we revisit the nature of insight prob- 
lem solving, a topic that is currently receiv- 
ing much attention. The research on this 
topic aims to sort out the inherent inter- 
play between representation and solution 
generation. 


Mathematical Problem Solving 


DOMAIN KNOWLEDGE 


Wertheimer (1959) found that structural un- 
derstanding helps solvers to see important 
similarities between problems that differ in 
appearance. Research reviewed in the repre- 
sentation section showed that experts (i.e., 
people with high domain knowledge) bet- 
ter understand the structure of problems 
within their domain of expertise than do 
novices (i.e., people with low domain knowl- 
edge). It therefore seems reasonable to pre- 
dict that the expertise-related differences 
in problem representation would affect the 
methods that experts and novices attempt to 
use to solve novel problems. We review two 
studies by Novick (1988) on mathematical 
problem solving by analogy that provide evi- 
dence for such a link between representation 
and solution. 

Figure 14.5. Finding the area of (a) a 
parallelogram, (b) a quadrilateral, and (c) an 
irregularly shaped geometric figure. The solid 
lines indicate the geometric figures whose areas 
are desired. The dashed lines show how to 
convert the given figures into rectangles (i.e., 
they show solutions with understanding). 

In one experiment, Novick (1988) rea- 
soned that arithmetic experts (i.e., people 
who are highly skilled at arithmetic) would 
be more likely than novices (i.e., people 
who are less skilled at arithmetic) to ap- 
ply a learned procedure to an analogous test 
problem with a different cover story, be- 
cause only experts would construct similar 
representations for the two problems. The 
example problem concerned purchasing 
plants for a vegetable garden. The test 
problem concerned arranging members of a 
marching band into rows and columns. In 
the control condition, subjects attempted to 
solve the band problem after having been 
taught how to solve three unrelated prob- 
lems. In the experimental condition, one of 
the unrelated problems was replaced by the 
vegetable garden problem. The learned solu- 
tion procedure for this problem was based on 
finding the lowest common multiple (LCM) 
of three numbers and then examining mul- 
tiples of the LCM to find a number that 
fit certain constraints. This solution proce- 
dure is also appropriate for the band prob- 
lem. Alternatively, the band problem can be 
solved by examining multiples of the indi- 
vidual numbers given in the problem. The 
data strongly supported Novick’s hypothe- 
sis of differential transfer for experts and 
novices: Among the novice group, 6% of 
[...] procedure to solve the marching band prob- 
lem. Among the experts, in contrast, 56% of 
subjects in the experimental condition used 
the more efficient LCM procedure, com- 
pared with only 6% of subjects in the con- 
trol condition. Consistent with these results, 
Dunbar (2001) reported that when scientists 
attempted to resolve puzzles in their own 
work, they generally retrieved analogies on 
the basis of shared relational structure. 
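
The two solution procedures described for the band problem can be contrasted in a short sketch. Everything specific here is an assumption made for illustration: the chapter does not report the numbers used in the garden or band problems, so the divisors, the remainder, the upper bound, and the final constraint below are hypothetical, as are the function names. The first function follows the learned procedure (find the LCM, then examine its multiples); the second examines multiples of each given number separately.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Lowest common multiple of two positive integers."""
    return a * b // gcd(a, b)


def lcm_procedure(divisors, remainder, fits, upper):
    """Step through multiples of the LCM (offset by the remainder) until one fits."""
    step = reduce(lcm, divisors)
    for n in range(remainder, upper + 1, step):
        if n > 0 and fits(n):
            return n
    return None


def individual_multiples_procedure(divisors, remainder, fits, upper):
    """List multiples of each number separately, then look for a shared value that fits."""
    candidate_sets = [set(range(remainder, upper + 1, d)) for d in divisors]
    shared = set.intersection(*candidate_sets)
    for n in sorted(shared):
        if n > 0 and fits(n):
            return n
    return None


# Hypothetical problem: a count of at most 100 that leaves 1 over when
# grouped by 3, 4, or 6, but groups evenly by 5.
DIVISORS = (3, 4, 6)
REMAINDER = 1
UPPER = 100


def groups_evenly_by_five(n):
    return n % 5 == 0
```

Both procedures return the same answer for the hypothetical problem. The LCM procedure reaches it after examining a handful of candidates, whereas the second procedure must first enumerate the multiples of every given number, which is the sense in which the LCM procedure is the more efficient of the two.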

Novick’s (1988) first experiment focused 
on the beneficial consequences of experts’ 
structurally based representations. Another 
experiment focused on potential negative 
consequences of novices’ superficially based 
representations. All subjects were initially 
taught to solve three problems. One prob- 
lem was the vegetable garden problem, 
which is similar in structure but dissimilar 
in story content to the marching band prob- 
lem. A second problem concerned seating 
people in rows and columns on an audi- 
torium stage. Despite its similarity in story 
content to the band problem, the audito- 
rium problem required a different solution 
procedure (i.e., the problems were struc- 
turally dissimilar). (Because the auditorium 
problem’s solution procedure is inappropri- 
ate for the band problem, control subjects 
almost never try to use that procedure to 
solve the band problem.) The third problem 
was unrelated to the band problem. Thus, 
when subjects received the band problem to 
solve, they could choose to use the LCM pro- 
cedure from the analogous vegetable garden 
problem, the incorrect procedure from the 
superficially similar auditorium problem, or 
some other solution method. As predicted, 
novices were more likely than experts to at- 
tempt to apply the incorrect procedure from 
the auditorium problem to the band prob- 
lem, and they were more persistent in their 
attempts to use this procedure. As many in- 
termediates as novices tried to use the incor- 
rect procedure, but fewer tried to do so more 
than once. Thus, superficial features play a 
decreasing role in analogical problem solv- 
ing as expertise increases (also see Dunbar, 
2001). Replicating the results of the initial 
experiment, experts were more likely than 
[...] LCM procedure to solve the band problem. 


LEARNING ABOUT PROBLEM SUBGOALS REVISITED 


As we discussed earlier in connection with 
means-ends analysis and the Tower of Hanoi 
problem, solvers often generate subgoals 
when they are unable to directly apply a 
desired operator. Subgoals also have been 
identified as components of task structure 
that can be taught to learners (Catrambone, 
1998). For example, in a statistics class, the 
task of computing a statistic for testing a 
hypothesis concerning central tendency can 
be divided into three subgoals: calculate 
the observed value, the hypothesized value, 
and the appropriate standard error. Subgoals 
in this sense decompose the problem into 
conceptually distinct and meaningful parts. 
Identifying the right subgoals thus implies 
that one has a good understanding of the 
structure of the problem, that is, a good 
representation. 
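
The three subgoals named above can be sketched in code. This is only an illustration: the text does not name a particular test, so the sketch assumes a one-sample t statistic, and the function names and sample scores are hypothetical.

```python
import math
import statistics


def observed_value(sample):
    """Subgoal 1: the observed value (here assumed to be the sample mean)."""
    return statistics.mean(sample)


def standard_error(sample):
    """Subgoal 3: the standard error of the mean (sample SD / sqrt(n))."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def t_statistic(sample, hypothesized_value):
    """Combine the subgoals; subgoal 2 (the hypothesized value) is supplied by the caller."""
    return (observed_value(sample) - hypothesized_value) / standard_error(sample)


# Hypothetical data, chosen only to make the arithmetic easy to follow.
scores = [12.0, 15.0, 11.0, 14.0, 13.0]
t = t_statistic(scores, hypothesized_value=10.0)
print(round(t, 4))
```

Keeping each subgoal in its own function mirrors the claim in the text: a solver who has represented "standard error" as a distinct quantity can swap in a different way of computing it without changing the rest of the procedure.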

Catrambone (1996, 1998) investigated 
the consequences for problem solving of 
instructional manipulations that affect sub- 
goal learning. He found that manipulat- 
ing solvers’ opportunity to learn an impor-
tant subgoal influenced their ability to solve 
probability problems involving the Poisson 
distribution and to adapt the learned pro- 
cedure to solve slightly altered problems. In 
one experiment, Catrambone (1996) manip- 
ulated subjects’ representations by varying 
whether the solution to the example prob- 
lem provided a label for the subgoal of find- 
ing the total number of objects of type X. 
Then he gave subjects several problems to 
solve, some of which were isomorphic to the 
example problem and some of which pro- 
vided somewhat different information about 
the objects relevant to the subgoal. He found 
that all subjects were highly successful at 
solving the isomorphic problems, which re- 
quired the same solution method as the ex- 
ample problem. However, for the test prob- 
lems that required a different method for 
finding the total number of objects of type 
X, subjects who had learned the subgoal per- 
formed much better than those who had not. That 
is, when solvers had good conceptual un- 
derstanding of the numeric quantity they 
needed to compute, they were better able to 
devise a new method for finding that quan- 
tity when the expected information was not 
provided in the problem. This result is remi- 
niscent of Wertheimer’s (1959) findings with 
the parallelogram problem and related area 
problems (Figure 14.5). 


OBJECT-BASED INFERENCES FROM STORY CONTENT 


In the section on problem representation, we 
described Hayes and Simon’s (1977) study in 
which differences in the texts of the transfer 
and change monsters and globes problems 
led to differences in the representations 
solvers constructed for those two problem 
isomorphs. We also described Duncker’s 
(1945) candles problem, in which the given 
objects (boxes) evoked inferences pertain- 
ing to their functional role (containers). 
In related work, Bassok and her colleagues 
have found that the objects in the texts of 
mathematical word problems affect (1) how 
people represent the described problem sit- 
uation (i.e., the situation model they con- 
struct) and, accordingly, (2) which math- 
ematical solution, or mathematical model, 
they select or construct (for a review, see 
Bassok, 2001). 

One set of studies varied the objects 
in mathematically isomorphic word prob- 
lems involving constant change (Alibali, 
Bassok, Solomon, Syc, & Goldin-Meadow, 
1999; Bassok & Olseth, 1995). The objects 
were chosen to evoke situation models in- 
volving either discrete or continuous change 
(e.g., constant change in the number of 
books per shelf on consecutive shelves of a 
bookcase or constant change in the amount 
of air pressed per minute into a hot air bal- 
loon, respectively). In Alibali et al.’s (1999) 
study, subjects had to describe the problems 
to a confederate and solve the problems. 
Subjects’ internal representations of the 
manner of change (i.e., their situation mod- 
els) for each problem were coded from their 
speech and, separately, from their gestures. 
The solution method a subject used for each 
problem was coded as either the sum 
strategy or the average strategy, which are 
compatible, respectively, with a representa- 
tion of change as a set of discrete events 
or as a single event. The results indicated 
that when subjects were judged to have con- 
structed a situation model involving discrete 
change (based on both speech and gesture), 
they were most likely to use the discrete sum 
strategy for solution. In contrast, when they 
constructed a situation model involving con- 
tinuous change, they were most likely to use 
the average strategy for solution. 
Another set of studies varied the seman- 
tic symmetry between object pairs in mathe- 
matical word problems and found that peo- 
ple’s solutions of these problems tended to 
have a corresponding mathematical symme- 
try. Bassok, Chase, and Martin (1998) pro- 
posed that the objects in a problem (e.g., 
tulips, vases) activate semantic and prag- 
matic knowledge that evokes relational in- 
ferences (e.g., the “contain” relation), which 
people include in their representations of the 
described situation. These situation mod- 
els, in turn, guide the selection of struc- 
turally analogous mathematical solutions. In 
the tulips and vases example, because the in- 
ferred containment relation between the ob- 
jects is asymmetric (tulips are in vases rather 
than vice versa), people select a mathemat- 
ically asymmetric solution (e.g., the divi- 
sion operation, which is asymmetric because 
a ÷ b ≠ b ÷ a). In a complementary way, 
objects from the same taxonomic category 
(e.g., tulips, roses) evoke a symmetric se- 
mantic relation (both tulips and roses are 
flowers), and the semantically symmetric sit- 
uation model leads people to select a mathe- 
matically symmetric solution (e.g., the addi- 
tion operation, which is symmetric because 
a + b = b + a). Bassok et al. refer to this 
two-stage process as semantic alignment. 
Semantic alignments affect how students 
solve novel mathematical word problems. 
For example, Bassok, Wu, and Olseth (1995) 
asked college students to solve unfamiliar 
permutation problems that involved random 
assignment of three objects from one set to 
another set. They used two sets of mathe- 
matically identical problems that varied with 


(m) and those in the assigned set (n). Most 
subjects who attempted to solve these novel 
problems arrived at incorrect solutions that 
revealed systematic effects of semantic align- 
ment. When the problems involved assign- 
ment of semantically asymmetric sets (e.g., 
m computers assigned to n secretaries), the 
solutions of most subjects placed the num- 
bers representing the two sets in mathemati- 
cally asymmetric structural roles (e.g., m3 /n! 
or m/3n); however, when the problems in- 
volved assignment of semantically symmet- 
ric sets (e.g., m doctors from one hospital 
assigned to n doctors from another hospi- 
tal), the solutions of most subjects placed the 
numbers representing the two sets in math- 
ematically symmetric structural roles [e.g., 
(m + n)/(mn)3, 3/(m + n)!]. That is, the 
incorrect solutions students generated to 
the permutation problems were structurally 
analogous to the semantic relation evoked 
by the paired sets. 

Semantic alignments also determine the 
relative difficulty of mathematically isomor- 
phic problems. Martin and Bassok (in press) 
asked middle school, high school, and col- 
lege students to solve simple division word 
problems, such as the following: “At a cer-
tain university, there are 3,450 students. 
There are 6 times as many students as pro- 
fessors. How many professors are there?” 
In this example, the semantic relation be- 
tween the described sets is asymmetric (pro- 
fessors teach students) and therefore seman- 
tically aligned with the correct (asymmetric) 
division operation. In other problems, the 
semantic relation between the described 
sets was symmetric and therefore misaligned 
with the correct (asymmetric) division op- 
eration. For example: “On a given day, a 
certain factory produces 3,450 nails. It pro- 
duces 6 times as many nails as screws. How 
many screws does it produce?” Students at 
all grade levels were more successful at solv- 
ing the aligned than the misaligned prob- 
lems, although the difference was most pro- 
nounced in middle school: 80% of seventh 
graders solved the students and professors 
problem, but only 40% solved the nails and 
screws problem. 
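
The arithmetic behind both example problems is the same, which is what makes them mathematically isomorphic. The following minimal sketch works through it; the function name is hypothetical, and the code illustrates only the mathematics, not how students actually reason.

```python
def smaller_set_size(larger_count, times_as_many):
    """Solve 'larger = times_as_many * smaller' for the smaller set."""
    if times_as_many <= 0 or larger_count % times_as_many != 0:
        raise ValueError("expected a positive ratio that divides the count evenly")
    return larger_count // times_as_many


# Aligned problem: 3,450 students, 6 times as many students as professors.
professors = smaller_set_size(3450, 6)

# Misaligned problem: 3,450 nails, 6 times as many nails as screws.
screws = smaller_set_size(3450, 6)

print(professors, screws)

# Division is asymmetric: swapping the operands changes the result.
print(3450 / 6 == 6 / 3450)
```

Both calls are identical, so any difference in difficulty between the two problems must come from the story content rather than from the computation.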
Inoue (2002) found evidence from electro- 
physiological data that semantic alignments 
occur very early in the solution process, 
when solvers read mathematical problems. 
People had to solve mathematically aligned 
problems, such as 3 tulips + 5 daisies =, 
and mathematically misaligned problems, 
such as 3 tulips + 5 vases =. Their event- 
related potentials (ERPs) revealed a sig- 
nificantly larger response of a certain spe- 
cific type (the N400 response, a negative 
electrical response occurring approximately 
400 ms after the event) to the misaligned 
target word (vases) than to the aligned tar- 
get word (daisies). This pattern is consistent 
with other evidence that N400 is evoked by 
detection of semantic anomalies. 


Insight Problem Solving Revisited 


OVERVIEW 


We introduced the notion of insight in 
our discussion of perceptual factors affect- 
ing solvers’ representations of the nine-dot 
problem. As we mentioned, the Gestalt 
view (e.g., Duncker, 1945; Maier, 1931; see 
Ohlsson, 1984a, for a review) is that insight 
problem solving is characterized by an ini- 
tial work period during which no progress 
toward solution is made (i.e., an impasse), a 
sudden restructuring of one’s problem repre- 
sentation to a more suitable form, followed 
immediately by the sudden appearance of 
the solution. Thus, solving insight problems 
is all about representation with essentially 
no role for a step-by-step process of gen- 
erating the solution. Although subsequent 
and contemporary researchers concur with 
the Gestalt view that getting the right rep- 
resentation is crucial, this view does not pro- 
vide a complete understanding of the na- 
ture of insight solutions because the solution 
does not necessarily arise suddenly or full- 
blown following restructuring (e.g., Weis- 
berg & Alba, 1981). Kershaw and Ohlsson 
(2004) argued that insight problems are dif- 
ficult because the key behavior required for 
solution may be hindered by perceptual fac- 
tors (this is the Gestalt perspective), back- 
ground knowledge, and/or process factors 
quired to find the solution). A full under-
standing of insight problem solving, like non- 
insight problem solving, requires attention 
to both representation and process. The in- 
terplay between these two factors is illus- 
trated in the two subsections that follow in 
which we consider (1) whether insight solu- 
tions arise full blown and (2) what explains 
the initial impasse and its resolution. 


DO INSIGHT SOLUTIONS ARISE FULL BLOWN? 


We noted in our earlier discussion of 
Weisberg and Alba’s (1981) research that so- 
lution of the nine-dot problem was neither 
quick nor direct following the restructur- 
ing hint. For example, subjects who solved 
the problem generally required 5 to 11 so- 
lution attempts after the hint to achieve 
success. Multiple solution attempts were 
needed because the required restructuring of 
one’s problem representation — realizing that 
(1) the lines may extend outside the imagi- 
nary square boundary formed by the dots, 
and (2) they may intersect at points in 
space that do not contain dots (Kershaw & 
Ohlsson, 2004) — suggests a new problem 
space, with alternative operators, through 
which the solver can search for the correct 
solution (Lung & Dominowski, 1985; Ohls- 
son, 1984b; Weisberg & Alba, 1981). 

For other problems, the required restruc- 
turing “brings the goal state within the hori- 
zon of mental look-ahead” (Ohlsson, 1984b, 
p. 124), yielding insight in the traditional 
sense of sudden understanding of the solu- 
tion. For example, explain the following sit- 
uation (Durso, Rea, & Dayton, 1994, p. 95): 
“A man walks into a bar and asks for a glass 
of water. The bartender points a shotgun at 
the man. The man says ‘Thank you,’ and 
walks out.” The solution to this problem typ- 
ically pops into mind suddenly and fully in- 
tact, accompanied by an irresistible feeling 
of “aha!” Moreover, the solver has no aware- 
ness of incremental progress toward the goal 
such as that which accompanies search so- 
lutions. (The solution to the barroom puz- 
zle is that the man had the hiccups. The 
bartender scared him with the gun, which 
cured his hiccups.) Anagrams can also 
yield such “pop-out” solutions (e.g., Mendel- 
sohn & O’Brien, 1974), especially among 
highly skilled anagram solvers (Novick & 
Sherman, 2003a). 

For problems that yield pop-out 
solutions — that is, for which solvers 
have the phenomenological experience of 
insight — the question remains as to whether 
the solutions arise full blown or through 
the gradual accumulation of relevant partial 
information as for the nine-dot problem and 
noninsight problems (e.g., simplifying alge- 
bra equations to solve for X). Durso et al. 
(1994) investigated this issue using the 
barroom puzzle. In one experiment, they 
collected similarity ratings for 12 pairs of 
concepts at several points during subjects’ 
solution attempts — before and after reading 
the puzzle, every 10 minutes until the 
puzzle was solved, and immediately after 
the solution. The concept pairs included 
two insight pairs (surprise/remedy and 
relieved/thank you) that the results of an ini- 
tial experiment showed were connected in 
the conceptual networks of solvers but not 
nonsolvers. The results suggested that the 
key restructuring required for solution did 
not arise full-blown contrary to the Gestalt 
view of insight: The two insight pairs that 
were critical for solution were seen as 
dissimilar at the first two time points, mod- 
erately similar at the next two time points, 
and highly similar after solution. In contrast, 
the unrelated pairs (e.g., pretzel/shotgun) 
were seen as dissimilar and the related 
pairs (e.g., shotgun/loaded) as similar at all 
time points. 

Novick and Sherman (2003a) noted, 
however, that having to repeatedly rate the 
similarity of concepts that were critical for 
solution may have changed subjects’ solu- 
tion strategies. This possibility led them to 
provide an additional test of the hypothesis 
using anagrams. The accrual of partial infor- 
mation was tested using a solvability judg- 
ment task in which subjects had to indi- 
cate whether letter strings (e.g., nrtai, botda) 
could be unscrambled to form an English 
word (only the first of the two examples 
is solvable — train). A deadline procedure 
was used so that subjects made their judg- 
ments based on any partial information that 
had accrued prior to the deadline. On av- 
erage, subjects’ responses were made within 
approximately 650 or 1130 ms after the on- 
set of the letter string. By testing highly 
skilled and less skilled anagram solvers on 
anagrams that were known to yield pop-out 
solutions (for experts) or not, Novick and 
Sherman were able to assess whether pop- 
out solutions arise full blown or are preceded 
by the gradual accumulation of partial infor- 
mation (outside awareness). Consistent with 
Durso et al.’s (1994) results, and contrary to 
the Gestalt view, they found that pop-out so- 
lutions arise gradually through the accumu- 
lation of relevant partial information (also 
see Bowden & Jung-Beeman, 2003). 
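
What counts as a "solvable" letter string in the solvability judgment task can be made concrete with a short sketch. It assumes a tiny hypothetical lexicon and uses an exhaustive letter-count comparison; it defines the task's correct answers and is not a model of how human solvers accumulate partial information.

```python
from collections import Counter


def is_solvable(letters, lexicon):
    """Return True if the letters can be rearranged into some word in the lexicon."""
    target = Counter(letters.lower())
    return any(Counter(word.lower()) == target for word in lexicon)


# Hypothetical miniature lexicon; a real study would use a full word list.
lexicon = {"train", "board", "table"}

print(is_solvable("nrtai", lexicon))  # the solvable example from the text
print(is_solvable("botda", lexicon))  # the unsolvable example from the text
```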

Despite this important similarity be- 
tween insight and noninsight solutions, phe- 
nomenologically, the two types of solutions 
are different. The solver is aware of the ac- 
cumulation of partial information for non- 
insight solutions — for example, consider the 
Hobbits and Orcs problem or the problem 
of simplifying an algebra equation to solve 
for X — but that accumulation occurs out- 
side awareness for insight solutions (e.g., 
the barroom puzzle, anagrams). Novick and 
Sherman (2003a, 2003b) hypothesized that 
pop-out solutions to anagrams, which are 
characteristic of experts, may result from 
a parallel constraint satisfaction process; in 
contrast, nonpop-out anagram solutions re- 
sult from a conscious process of serially test- 
ing and rejecting hypotheses (e.g., Mendel- 
sohn & O’Brien, 1974). 


THE IMPASSE AND ITS RESOLUTION 


As discussed by Knoblich, Ohlsson, Haider, 
and Rhenius (1999), theories of insight prob- 
lem solving need to explain two phenom- 
ena concerning the interplay between rep- 
resentation and solution generation: (1) why 
solvers initially reach an impasse in solving a 
problem for which they have the necessary 
knowledge to generate the solution, and (2) 
what enables them to break out of the im- 
passe. Two recent theories have attempted to 
account for these phenomena — MacGregor, 
Ormerod, and Chronicle’s (2001) progress 
monitoring theory and Knoblich et al.’s (1999) rep- 
resentational change theory. 

According to the progress monitoring 
theory, solvers use hill climbing (see “Prob- 
lem Solving as Search through a Problem 
Space”) in their solution attempts for in- 
sight as well as noninsight problems. Solvers 
are hypothesized to monitor their progress 
toward solution using a criterion generated 
from the problem’s current state. For the 
nine-dot problem, for example, this criterion 
is the number of dots through which lines 
have been drawn relative to the number of 
dots remaining. If solvers reach criterion fail- 
ure, they seek alternative solutions by trying 
to relax one or more problem constraints. 
The nine-dot problem is difficult, according 
to this theory, because criterion failure is not 
reached until the fourth move (recall that 
the problem must be solved in four moves). 
MacGregor et al. (2001) found support for 
this theory using several variants of the nine- 
dot problem (also see Ormerod, MacGregor, 
& Chronicle, 2002). 

According to Knoblich et al.’s (1999) rep- 
resentational change theory, insight prob- 
lems are highly likely to evoke initial 
representations in which solvers place inap- 
propriate constraints on their solution at- 
tempts. Impasses are resolved by revising 
one’s representation of the problem. They 
tested this theory using Roman numeral 
matchstick arithmetic problems in which 
solvers must move one stick to a new lo- 
cation to change a false numeric statement 
(e.g., VI = VIII + III) into a statement that 
is true. According to Knoblich et al.’s the- 
ory, rerepresentation may happen through 
either of two mechanisms — constraint re- 
laxation or chunk decomposition. Constraint 
relaxation involves deactivating some knowl- 
edge element that has constrained the op- 
erators being considered, thereby allowing 
application of new operators: For example, 
changing II + to III − requires relaxation of 
the value constraint (numeric values do not 
change except by applying an operation that 
produces a compensating change in some 
other value). Chunk decomposition involves 
breaking the bonds that link components 
of a meaningful unit in the problem: For 


composition of the plus sign. (The solution 
to this problem is to break apart the first 
V and change it to an X, yielding XI = VIII + 
III.) Knoblich et al. found good support for 
their theory using solution rate and solution 
time as their dependent measures. Knoblich, 
Ohlsson, and Raney (2001) found additional 
support using eye fixation data. 
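
The truth of a matchstick arithmetic statement can be checked mechanically, which makes clear why the example statement is false before the move and true after it. The sketch below is a minimal illustration with hypothetical function names; it assumes statements of the form "numeral = numeral + numeral" built only from I, V, and X, and it does not model the matchstick moves themselves.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}


def roman_to_int(numeral):
    """Convert a Roman numeral made of I, V, and X to an integer."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        # A smaller symbol before a larger one is subtracted (e.g., IV = 4).
        if i + 1 < len(numeral) and ROMAN_VALUES[numeral[i + 1]] > value:
            total -= value
        else:
            total += value
    return total


def statement_is_true(statement):
    """Evaluate a statement such as 'XI = VIII + III' (addition only)."""
    left, right = statement.split("=")
    terms = [term.strip() for term in right.split("+")]
    return roman_to_int(left.strip()) == sum(roman_to_int(t) for t in terms)


print(statement_is_true("VI = VIII + III"))  # initial (false) statement
print(statement_is_true("XI = VIII + III"))  # statement after changing V to X
```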

Jones (2003) attempted to distinguish the 
progress monitoring and representational 
change theories using eye fixation data as 
subjects solved the car park problem. In this 
problem, the goal is to maneuver a taxi out of 
a car park. Other cars need to be moved out 
of the way, and there are constraints on how 
cars may be moved. Jones’ results supported 
predictions from both theories, although the 
effects of the experimental manipulations 
suggested that the representational change 
theory is a better predictor of performance. 
Based on his data, Jones argued that the two 
theories should be combined into a single 
theory. This makes sense because Knoblich 
et al.’s (1999) theory focuses more on the 
representational aspect of problem solution, 
whereas MacGregor et al.’s (2001) theory 
focuses more on the step-by-step solution 
process. Jones noted that the progress moni- 
toring theory provides an account of the so- 
lution process up to the point that the im- 
passe is reached and representational change 
is sought. The representational change the- 
ory picks up at this point and explains how 
insight may be achieved. 


Conclusions and Directions 
for Future Research 


In this chapter, we examined two broad 
components of the problem-solving 
process — representation and solution gener- 
ation. Although it is possible to focus one’s 
research on one or the other of these com- 
ponents, a full understanding of problem 
solving requires an integration of the two, 
for the representation one constructs for a 
problem determines (or at least constrains) 
how one goes about trying to generate a 
ematical problem solving, as we discussed 
in the previous section. Consideration of 
both representation and solution generation 
also seems to be behind the resurgence of 
interest in insight problem solving. This 
new strategy for investigating insight seems 
to be yielding progress in understanding this 
fascinating phenomenon that is at the core 
of human creative endeavors. We believe 
the interplay between representation and 
solution generation will lead to significant 
progress in understanding the full range of 
activities considered to be problem solving. 
Elevating this interplay to the status of a 
core assumption, we want to suggest three 
directions for future research. 

First, we would stress the importance of 
conducting educationally relevant research. 
Students spend a considerable amount of 
time both solving problems and learning 
how to solve problems. Society expects 
that the problem-solving lessons learned in 
school — from how to solve math problems 
to how to design and execute a science fair 
project to how to analyze literature — will 
transfer to students’ adult lives for the bet- 
terment of the world. We believe that a two- 
pronged effort is needed here: (1) It is im- 
portant to gain a better understanding of 
students’ contributions to problem solving. 
What are their goals, beliefs, strategies, and 
conceptions? How do they construct mean- 
ing and infer structure? (2) At the same 
time, there is an objective reality to prob- 
lems, messy though they may sometimes be. 
The nature of a problem’s underlying struc- 
ture places constraints on the types of rep- 
resentations that will be useful or appropri- 
ate, which in turn determine the types of 
solution methods that will be effective and 
efficient. It is important, therefore, to un- 
derstand the factors that facilitate or hinder 
a student’s ability to represent a problem’s 
structure as well as to investigate methods 
for helping students to succeed in this en- 
deavor. The National Council of Teachers of 
Mathematics (2000) similarly promotes the 
importance of teaching students how to cre- 
ate and use a variety of different types of 
representations to model phenomena in the 


ing, as well as creative invention, all require 
appropriate models as their starting point. 

Second, the trend toward examining 
more complex, knowledge-intensive prob- 
lems should continue. Although the avail- 
able evidence suggests that many of the 
conclusions about problem solving drawn 
from research on well-defined problems are 
applicable to ill-defined problems, messy, 
knowledge-intensive, real-world problems 
may not be simply scaled-up versions of lab- 
oratory tasks or of tasks practiced in school. 
The critical problems of the day, at any given 
point in history, are always ill defined in 
some way. Investigation of such problems 
(e.g., in science, medicine, and technology) is 
likely to yield both theoretical and practical 
payoffs. 

Finally, we come full circle and end where 
we began. The last direction is suggested by 
the definition of a problem given by Karl 
Duncker, arguably the father of research on 
problem solving. He defined a problem as 
a situation in which a desired goal cannot 
be attained by direct application of known 
operators, and so “there has to be recourse 
to thinking” (Duncker, 1945, p. 1). Our 
review of problem-solving research in this 
chapter has been rather narrow — focusing 
on puzzles (e.g., Hobbits and Orcs, Tower 
of Hanoi, anagrams, the nine-dot problem) 
and on mathematical problems. However, 
Duncker’s reference to thinking is quite 
broad. By Duncker’s definition, humans en- 
gage in problem solving when they pur-
sue the following goal-directed activities: 
(1) placing objects into categories and mak- 
ing inferences based on category member- 
ship, (2) making inductive inferences from 
multiple instances, (3) reasoning by analogy, 
(4) identifying the causes of events, (5) de- 
ducing logical implications of given infor- 
mation, (6) making legal judgments, and 
(7) diagnosing medical conditions from his- 
torical and laboratory data. Much of the ma- 
terial included in the chapters on these top- 
ics in the present volume arguably could 
have appeared in our chapter on problem 
solving. Rather than engaging in a turf battle, 
we would suggest that research on problem 
other areas of thinking, or that research in 
these other areas be informed by insights 
gained from research on what has more tra- 
ditionally been identified as problem solving. 
CHAPTER 15 


Creativity 


Robert J. Sternberg 
Todd I. Lubart 
James C. Kaufman 
Jean E. Pretz 


Creativity is the ability to produce work that 
is novel (i.e., original, unexpected), high in 
quality, and appropriate (i.e., useful, meets 
task constraints) (Lubart, 1994; Ochse, 
1990; Sternberg, 1988a, 1999c; Sternberg & 
Lubart, 1995, 1996). Creativity is a topic of 
wide scope that is important at both the in- 
dividual and societal levels for a wide range 
of task domains. At an individual level, cre- 
ativity is relevant, for example, when solv- 
ing problems on the job and in daily life. 
At a societal level, creativity can lead to 
new scientific findings, new movements in 
art, new inventions, and new social pro- 
grams. The economic importance of creativ- 
ity is clear because new products or ser-
vices create jobs. Furthermore, individuals, 
organizations, and societies must adapt ex- 
isting resources to changing task demands to 
remain competitive. 

This chapter attempts to provide readers 
with a basic understanding of the literature 
on creativity. It first reviews alternative ap- 
proaches to understanding creativity. Then 
it reviews alternative approaches to under- 
standing kinds of creative work. Finally, it 
draws some conclusions. 


Creativity may be viewed as taking place 
in the interaction between a person and the 
person’s environment (Amabile, 1996; Csik- 
szentmihalyi, 1996, 1999; Feldman, 1999; 
Feldman, Csikszentmihalyi, & Gardner, 
1994; Sternberg, 1985a; Sternberg & Lubart, 
1995). According to this view, the essence 
of creativity cannot be captured just as an 
intrapersonal variable. Thus, we can charac- 
terize a person’s cognitive processes as more 
or less creative (Finke, Ward, & Smith, 1992; 
Rubenson & Runco, 1992; Weisberg, 1986), 
or the person as having a more or less cre- 
ative personality (Barron, 1988; Feist, 1999). 
We further can describe the person as having 
a motivational pattern that is more or less 
typical of creative individuals (Hennessey 
& Amabile, 1988), or even as having back- 
ground variables that more or less dispose 
that person to think creatively (Simonton, 
1984, 1994). However, we cannot fully judge 
that person’s creativity independent of the 
field and the temporal context in which the 
person works. 

For example, a contemporary artist might 
have thought processes, personality, motiva- 
tion, and even background variables similar 
to those of Monet, but that artist, working 
today in the style of Monet or of Impression- 
ism in general, probably would not be judged 
to be creative in the way Monet was. Artists, 
including Monet, have experimented with 
Impressionism, and unless the contemporary 
artist introduced some new twist, he or she 
might be viewed as imitative rather than cre- 
ative. 

The importance of context is illustrated 
by the difference, in general, between cre- 
ative discovery and rediscovery. For ex- 
ample, BACON and related programs of 
Langley, Simon, Bradshaw, and Zytkow 
(1987) rediscover important scientific the- 
orems that were judged to be creative dis- 
coveries in their time. The processes by 
which these discoveries are made via com- 
puter simulation are presumably not iden- 
tical to those by which the original discov- 
erers made their discoveries. One difference 
derives from the fact that contemporary pro- 
grammers can provide, in their programming 
of information into computer simulations, 
representations and particular organizations 
of data that may not have been available 
to the original creators. However, putting 
aside the question of whether the processes 
are the same, a rediscovery might be judged 
to be creative with respect to the rediscov- 
erer but would not be judged to be creative 
with respect to the field at the time the re- 
discovery is made. Ramanujan, the famous 
Indian mathematician, made many such re- 
discoveries. A brilliant thinker, he did not 
have access in his early life to much of the 
recent literature on mathematics and so un- 
wittingly regenerated many discoveries that 
others had made before him. 


Alternative Approaches to Creativity 


Mystical Approaches to the Study 
of Creativity 


The study of creativity has always been 
tinged — some might say tainted — with asso- 
ciations to mystical beliefs. Perhaps the ear- 
liest accounts of creativity were based on di- 
vine intervention. The creative person was 


ing would fill with inspiration. The individ- 
ual would then pour out the inspired ideas, 
forming an otherworldly product. 

In this vein, Plato argued that a poet is 
able to create only that which the Muse dic- 
tates, and even today, people sometimes re- 
fer to their own Muse as a source of in- 
spiration. In Plato’s view, one person might 
be inspired to create choral songs, another, 
epic poems (Rothenberg & Hausman, 1976). 
Often, mystical sources have been suggested 
in creators’ introspective reports (Ghiselin, 
1985). For example, Rudyard Kipling re- 
ferred to the “Daemon” that lives in the 
writer’s pen: “My Daemon was with me in 
the Jungle Books, Kim, and both Puck books, 
and good care I took to walk delicately, 
lest he should withdraw....When your 
Daemon is in charge, do not think cons- 
ciously. Drift, wait, and obey” (Kipling, 
1985, p. 162). 

The mystical approaches to the study of 
creativity have probably made it harder for 
scientists to be heard. Many people seem 
to believe, as they believe for love (see 
Sternberg, 1988b, 1988c), that creativity is 
something that just does not lend itself to 
scientific study because it is a more spiritual 
process. We believe it has been hard for sci- 
entific work to shake the deep-seated view of 
some that, somehow, scientists are treading 
where they should not. 


Pragmatic Approaches 


Equally damaging for the scientific study of 
creativity, in our view, has been the takeover 
of the field, in the popular mind, by those 
who follow what might be referred to as a 
pragmatic approach. Those taking this ap- 
proach have been concerned primarily with 
developing creativity, secondarily with un- 
derstanding it, but almost not at all with test- 
ing the validity of their ideas about it. 
Perhaps the foremost proponent of this 
approach is Edward De Bono, whose work 
on lateral thinking — seeing things broadly and 
from varied viewpoints — as well as other as- 
pects of creativity has had what appears to 
be considerable commercial success (e.g., De 
is not with theory, but with practice. Thus, 
for example, he suggests using a tool such as 
“Positive-Minus-Interesting” (PMI) to focus 
on the aspects of an idea that are pluses, mi- 
nuses, and interesting. Or he suggests using 
the word “po,” derived from hypothesis, sup- 
pose, possible, and poetry, to provoke rather 
than to judge ideas. Another tool, that of 
“thinking hats,” has individuals metaphori- 
cally wear different hats, such as a white hat 
for data-based thinking, a red hat for intu- 
itive thinking, a black hat for critical think- 
ing, and a green hat for generative thinking, 
in order to stimulate seeing things from dif- 
ferent points of view. 

De Bono is not alone in this enterprise. 
Osborn (1953), based on his experiences 
in advertising agencies, developed the tech- 
nique of brainstorming to encourage people 
to solve problems creatively by seeking 
many possible solutions in an atmosphere 
that is constructive rather than critical 
and inhibitory. Gordon (1961) developed a 
method called synectics, which involves pri- 
marily seeing analogies, also for stimulating 
creative thinking. 

More recently, authors such as Adams 
(1974, 1986) and von Oech (1983) sug- 
gested that people often construct a series of 
false beliefs that interfere with creative func- 
tioning. For example, some people believe 
that there is only one “right” answer and that 
ambiguity must be avoided whenever possi- 
ble. People can become creative by identify- 
ing and removing these mental blocks. Von 
Oech (1986) also suggested that to be cre- 
ative we need to adopt the roles of explorer, 
artist, judge, and warrior in order to foster 
our creative productivity. 

These approaches have had considerable 
public visibility, and they may well be use- 
ful. From our point of view as psycholo- 
gists, however, most of these approaches lack 
any basis in serious psychological theory as 
well as serious empirical attempts to vali- 
date them. Of course, techniques can work 
in the absence of psychological theory or 
validation. However, the effect of such ap- 
proaches is often to leave people associating 
a phenomenon with commercialization and 


psychological study. 


The Psychodynamic Approach 


The psychodynamic approach can be con- 
sidered the first of the major twentieth- 
century theoretical approaches to the study 
of creativity. On the basis of the idea that 
creativity arises from the tension between 
conscious reality and unconscious drives, 
Freud (1908/1959) proposed that writers 
and artists produce creative work as a way to 
express their unconscious desires in a pub- 
licly acceptable fashion. These unconscious 
desires may concern power, riches, fame, 
honor, or love (Vernon, 1970). Case stud- 
ies of eminent creators, such as Leonardo da 
Vinci (Freud, 1910/1964), were used to sup- 
port these ideas. 

Later, the psychoanalytic approach in- 
troduced the concepts of adaptive regres- 
sion and elaboration for creativity (Kris, 
1952). Adaptive regression, the primary pro- 
cess, refers to the intrusion of unmodulated 
thoughts in consciousness. Unmodulated 
thoughts can occur during active problem 
solving but often occur during sleep, in- 
toxication from drugs, fantasies or day- 
dreams, or psychoses. Elaboration, the sec- 
ondary process, refers to the reworking and 
transformation of primary process mate- 
rial through reality-oriented, ego-controlled 
thinking. Other theorists (e.g., Kubie, 1958) 
emphasized that the preconscious, which 
falls between conscious reality and the en- 
crypted unconscious, is the true source 
of creativity because thoughts are loose 
and vague but interpretable. In contrast 
to Freud, Kubie claimed that unconscious 
conflicts actually have a negative effect 
on creativity because they lead to fixated, 
repetitive thoughts. More recent work has 
recognized the importance of both pri- 
mary and secondary processes (Noy, 1969; 
Rothenberg, 1979; Suler, 1980; Werner & 
Kaplan, 1963). 

Although the psychodynamic approach 
may have offered some insights into creativ- 
ity, psychodynamic theory was not at the 
center of the emerging scientific psychology. 
psychology, such as structuralism, functionalism, 
and behaviorism, were devoting practically 
no resources at all to the study of creativity. 
The Gestaltists studied a portion of 
creativity — insight — but their study never 
went much beyond labeling, as opposed to 
characterizing the nature of insight. 

Further isolating creativity research, the 
psychodynamic approach and other early 
work on creativity relied on case studies 
of eminent creators. This methodology has 
been criticized historically because of the 
difficulty of measuring proposed theoretical 
constructs (e.g., primary process thought), 
and the amount of selection and interpreta- 
tion that can occur in a case study (Weisberg, 
1993). Although there is nothing a priori 
wrong with case study methods, the emerg- 
ing scientific psychology valued controlled, 
experimental methods. Thus, both theoret- 
ical and methodological issues served to 
isolate the study of creativity from main- 
stream psychology. 


Psychometric Approaches 


When we think of creativity, eminent artists 
or scientists such as Michelangelo or Einstein 
immediately come to mind. However, these 
highly creative people are quite rare and dif- 
ficult to study in the psychological labora- 
tory. In his American Psychological Asso- 
ciation address, Guilford (1950) noted that 
these problems had limited research on cre- 
ativity. He proposed that creativity could 
be studied in everyday subjects using paper- 
and-pencil tasks. One of these was the Un- 
usual Uses Test, in which an examinee thinks 
of as many uses for a common object (e.g., a 
brick) as possible. Many researchers adopted 
Guilford’s suggestion, and “divergent think- 
ing” tasks quickly became the main instru- 
ments for measuring creative thinking. The 
tests were a convenient way of comparing 
people on a standard “creativity” scale. 
Building on Guilford’s work, Torrance 
(1974) developed the Torrance Tests of 
Creative Thinking. These tests consist of 
several relatively simple verbal and figural 
tasks that involve divergent thinking plus 


be scored for fluency (total number of rel- 
evant responses), flexibility (number of dif- 
ferent categories of relevant responses), orig- 
inality (the statistical rarity of the responses), 
and elaboration (amount of detail in the 
responses). Some of the subtests from the 
Torrance battery include 


1. Asking questions: The examinee writes 
out all of the questions he or she can think 
of based on a drawing of a scene. 

2. Product improvement: The examinee 
lists ways to change a toy monkey so chil- 
dren will have more fun playing with it. 

3. Unusual uses: The examinee lists interest- 
ing and unusual uses of a cardboard box. 


4. Circles: The examinee expands empty 
circles into different drawings and titles 
them. 
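
The fluency, flexibility, and originality scores described above can be illustrated with a minimal Python sketch. The response categories, the norm-sample counts, and the 5% rarity cutoff are hypothetical values chosen for illustration and are not the published Torrance scoring norms. Elaboration is left out because it depends on a rater's judgment of detail.

```python
def score_divergent_thinking(responses, category_of, norm_counts, norm_size,
                             rarity_cutoff=0.05):
    """Score one examinee's responses to a divergent-thinking item.

    category_of, norm_counts, norm_size, and rarity_cutoff are illustrative
    assumptions, not published scoring norms.
    """
    # A response counts as relevant only if the (hypothetical) guide lists it.
    relevant = [r for r in responses if r in category_of]
    # Fluency: total number of relevant responses.
    fluency = len(relevant)
    # Flexibility: number of different categories among relevant responses.
    flexibility = len({category_of[r] for r in relevant})
    # Originality: relevant responses that are statistically rare in the norm sample.
    originality = sum(
        1 for r in relevant
        if norm_counts.get(r, 0) / norm_size < rarity_cutoff
    )
    return {"fluency": fluency, "flexibility": flexibility,
            "originality": originality}


# Hypothetical scoring guide for "unusual uses of a cardboard box".
CATEGORY_OF = {
    "storage bin": "container",
    "planter": "container",
    "play fort": "toy",
    "sled": "vehicle",
}
# Hypothetical counts of each response in a norm sample of 100 examinees.
NORM_COUNTS = {"storage bin": 70, "planter": 20, "play fort": 35, "sled": 3}

scores = score_divergent_thinking(
    ["storage bin", "planter", "play fort", "sled", "it is brown"],
    CATEGORY_OF, NORM_COUNTS, norm_size=100,
)
print(scores)  # {'fluency': 4, 'flexibility': 3, 'originality': 1}
```

In this toy run the irrelevant response is dropped, two responses share the "container" category, and only "sled" falls under the assumed rarity cutoff.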


A number of investigators have studied 
the relationship between creativity and in- 
telligence — at least as measured by IQ. 
Three basic findings concerning creativity 
and conventional conceptions of intelligence 
are generally agreed upon (see, e.g., Barron 
& Harrington, 1981; Lubart, 1994). First, 
creative people tend to show above-average 
IQs — often above 120 (see Renzulli, 1986). 
This figure is not a cutoff but rather an ex- 
pression of the fact that people with low or 
even average IQs do not seem to be well 
represented among the ranks of highly cre- 
ative individuals. Cox’s (1926) geniuses had 
an estimated average IQ of 165. Barron es- 
timated the mean IQ of his creative writers 
to be 140 or higher based on their scores on 
the Terman Concept Mastery Test (Barron, 
1963, p. 242). It should be noted that the 
Concept Mastery Test is exclusively verbal 
and thus provides a somewhat skewed esti- 
mate of IQ. The other groups in the Institute 
for Personality Assessment and Research (IPAR) studies, 
that is, mathematicians and research scien- 
tists, were also above average in intelligence. 
Anne Roe (1952, 1972), who did similarly 
thorough assessments of eminent scientists 
before the IPAR group was set up, estimated 
IQs for her participants that ranged between 121 
and 194, with medians between 137 and 166, 
bal, spatial, or mathematical. 

Second, IQ does not seem to matter as 
much to creativity above 120 as it does 
below 120. In other words, creativity 
may be more highly correlated with 
IQ below an IQ of 120, but only weakly 
or not at all correlated with it above an 
IQ of 120. [This relationship is often called 
the threshold theory. See the contrast with 
Hayes’s (1989) certification theory discussed 
below.] In the architects’ study, in which the 
average IQ was 130 (significantly above average), 
the correlation between intelligence 
and creativity was -.08, not significantly 
different from zero (Barron, 1969, p. 42). 
However, in the military officer study, in which 
participants were of average intelligence, the 
correlation was .33 (Barron, 1963, p. 219). 
These results suggest that extremely highly 
creative people often have high IQs, but not 
necessarily that people with high IQs tend 
to be extremely creative (see also Getzels & 
Jackson, 1962). 

Some investigators (e.g., Simonton, 1994; 
Sternberg, 1996) have suggested that very 
high IQ may actually interfere with creativ- 
ity. Those who have very high IQs may be 
so highly rewarded for their IQ-like 
(analytical) skills that they fail to develop the 
creative potential within them, which may then 
remain latent. 

Third, the correlation between IQ and 
creativity is variable, usually ranging from 
weak to moderate (Flescher, 1963; Getzels & 
Jackson, 1962; Guilford, 1967; Herr, Moore, 
& Hasen, 1965; Torrance, 1962; Wallach & 
Kogan, 1965; Yamamoto, 1964). The corre- 
lation depends in part on what aspects of cre- 
ativity and intelligence are being measured, 
how they are being measured, and in what 
field the creativity is manifested. The role 
of intelligence is different in art and music, 
for instance, than it is in mathematics and 
science (McNemar, 1964). 

An obvious drawback to the tests used 
and assessments done by Roe and Guilford 
is the time and expense involved in 
administering them, as well as the subjective 
scoring of them. In contrast, Mednick (1962) 
produced a 30-item, objectively scored, 


Remote Associates Test (RAT). The test is 
based on his theory that the creative think- 
ing process is the “forming of associative el- 
ements into new combinations which either 
meet specified requirements or are in some 
way useful. The more mutually remote the 
elements of the new combination, the more 
creative the process or solution” (Mednick, 
1962). Because the ability to make these 
combinations and arrive at a creative solu- 
tion necessarily depends on the existence of 
the combinations (i.e., the associative ele- 
ments) in a person’s knowledge base, and 
because the probability and speed of attain- 
ment of a creative solution are influenced 
by the organization of the person’s associations, 
Mednick’s theory suggests that creativity 
and intelligence are closely related; they are 
overlapping sets. 

Moderate correlations of .55, .43, and 
.41 have been shown between the RAT 
and the WISC (Wechsler Intelligence Scale 
for Children), the SAT verbal, and the 
Lorge-Thorndike Verbal intelligence mea- 
sures, respectively (Mednick & Andrews, 
1967). Correlations with quantitative intelligence 
measures were lower (r = .20 to .34). 
Correlations with other measures of cre- 
ative performance have been more variable 
(Andrews, 1975). 

This psychometric approach for measur- 
ing creativity had both positive and nega- 
tive effects on the field. On the positive 
side, the tests facilitated research by provid- 
ing a brief, easy to administer, objectively 
scorable assessment device. Furthermore, 
research was now possible with “everyday” 
people (i.e., noneminent samples). 
However, there were also some negative 
effects. First, some researchers criticized brief 
paper-and-pencil tests as trivial, inadequate 
measures of creativity; larger productions 
such as actual drawings or writing samples 
should be used instead. Second, other crit- 
ics suggested that no fluency, flexibility, orig- 
inality, or elaboration scores captured the 
concept of creativity. In fact, the definition 
and criteria for creativity are a matter of on- 
going debate, and relying on the objectively 
defined statistical rarity of a response with 
regard to all of the responses of a 
population is only one of many options. 
Other possibilities include using the social 
consensus of judges (see Amabile, 1983). 
Third, some researchers were less enchanted 
by the assumption that noneminent sam- 
ples could shed light on eminent levels of 
creativity, which was the ultimate goal for 
many studies of creativity (e.g., Simonton, 
1984). Thus, a certain malaise developed 
and continues to accompany the paper-and- 
pencil assessment of creativity. Some psy- 
chologists, at least, avoided this measure- 
ment quagmire in favor of less problematic 
research topics. 


Cognitive Approaches 


The cognitive approach to creativity seeks 
understanding of the mental representations 
and processes underlying creative thought 
(see Lubart, 2000-2001). By studying per- 
ception or memory, one would already be 
studying the bases of creativity; thus, the 
study of creativity would merely represent 
an extension, and perhaps not a very large 
one, of work that is already being done un- 
der another guise. For example, in the cogni- 
tive area, creativity was often subsumed un- 
der the study of intelligence (see Sternberg, 
Chap. 31). We do not argue with the idea 
that creativity and intelligence are related 
to each other (Lubart, 2003; Sternberg 
& O'Hara, 1999). However, the subsump- 
tion has often been so powerful that re- 
searchers such as Wallach and Kogan (1965), 
among others, had to write at length on why 
creativity and intelligence should be viewed 
as distinct entities. In more recent cognitive 
work, Weisberg (1986, 1988, 1993, 1999) 
has proposed that creativity involves essen- 
tially ordinary cognitive processes yielding 
extraordinary products. A similar point has 
been made by Perkins (1981). Weisberg at- 
tempted to show that the insights depend on 
subjects using conventional cognitive pro- 
cesses (e.g., analogical transfer) applied to 
knowledge already stored in memory. He 
did so through the use of case studies of 
eminent creators and laboratory research, 
such as studies with Duncker’s (1945) candle 
problem. This problem requires participants 
to attach a candle to a wall using only 
objects available in a picture (candle, box of 
tacks, and book of matches). Langley et al. 
(1987) made a similar claim about the ordi- 
nary nature of creative thinking. 

As a concrete example of this approach, 
Weisberg and Alba (1981) had people solve 
the notorious nine-dot problem. In this 
problem, people are asked to connect all of 
the dots, which are arranged in the shape 
of a square with three rows of three dots 
each, using no more than four straight lines, 
never arriving at a given dot twice, and never 
lifting their pencil from the page. The prob- 
lem can be solved only if people allow their 
line segments to go outside the periphery of 
the dots. Typically, solution of this task had 
been viewed as hinging upon the insight that 
one had to go “outside the box.” Weisberg 
and Alba showed that even when people 
were given that insight, they still had 
difficulty in solving the problem. In other words, 
whatever is required to solve the nine-dot 
problem, it is not just some kind of extra- 
ordinary insight. 
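
The geometry of the nine-dot problem can be checked with a short Python sketch. The coordinate system (dots at integer points 0 to 2) and the particular four-line path are assumptions made for illustration; the path is one commonly shown solution, not one taken from Weisberg and Alba. The sketch verifies coverage only and does not test the rule against arriving at a dot twice.

```python
# The nine dots sit at integer coordinates (0..2, 0..2).
DOTS = {(x, y) for x in range(3) for y in range(3)}


def on_segment(p, a, b):
    """Return True if integer point p lies on the closed segment from a to b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if cross != 0:
        return False  # p is not on the line through a and b
    dot = (p[0] - a[0]) * (b[0] - a[0]) + (p[1] - a[1]) * (b[1] - a[1])
    length_sq = (b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2
    return 0 <= dot <= length_sq


def covered_dots(path):
    """Dots touched by the connected line segments through the path vertices."""
    segments = list(zip(path, path[1:]))
    return {d for d in DOTS for a, b in segments if on_segment(d, a, b)}


def stays_in_box(path):
    """True if every turning point lies within the square of dots."""
    return all(0 <= x <= 2 and 0 <= y <= 2 for x, y in path)


# Four connected segments whose turning points leave the square twice.
SOLUTION = [(0, 2), (2, 0), (2, 3), (-1, 0), (1, 0)]
# A four-segment attempt that stays inside the square misses the center dot.
BOXED_ATTEMPT = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 1)]

print(len(covered_dots(SOLUTION)), stays_in_box(SOLUTION))            # 9 False
print(len(covered_dots(BOXED_ATTEMPT)), stays_in_box(BOXED_ATTEMPT))  # 8 True
```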

There have been studies with both hu- 
man subjects and computer simulations of 
creative thought. Approaches based on the 
study of human subjects are perhaps proto- 
typically exemplified by the work of Finke, 
Ward, and Smith (1992) (see also contri- 
butions to Smith, Ward, & Finke, 1995; 
Sternberg & Davidson, 1994; Ward, Smith, 
& Finke, 1999). Finke and his colleagues 
have proposed what they call the Geneplore 
model, according to which there are two 
main processing phases in creative thought — 
a generative phase and an exploratory phase. 
In the generative phase, an individual con- 
structs mental representations referred to as 
preinventive structures, which have proper- 
ties promoting creative discoveries. In the 
exploratory phase, these properties are used 
to come up with creative ideas. A number 
of mental processes may enter into these 
phases of creative invention, such as re- 
trieval, association, synthesis, transformation 
(see Tversky, Chap. 10), analogical transfer 
(see Holyoak, Chap. 6), and categorical 
reduction (i.e., mentally reducing objects 
descriptions). In a typical experimental test 
based on the model (Finke & Slayton, 1988), 
participants will be shown parts of objects, 
such as a circle, a cube, a parallelogram, and 
a cylinder. On a given trial, three parts will 
be named, and participants will be asked to 
imagine combining the parts to produce a 
practical object or device. For example, par- 
ticipants might imagine a tool, a weapon, or 
a piece of furniture. The objects thus pro- 
duced are then rated by judges for their prac- 
ticality and originality. Morrison and Wallace 
(2002) found that judged creativity on such 
a task correlated strongly with the individu- 
als’ perceived imagery vividness. 

In work on convergent creative thinking 
that required participants to think in un- 
usual ways, we presented 80 individuals with 
novel kinds of reasoning problems that had a 
single best answer. For example, they might 
be told that some objects are green and oth- 
ers blue, whereas still other objects might be 
grue, meaning green until the year 2000 and 
blue thereafter, or bleen, meaning blue until 
the year 2000 and green thereafter. Or they 
might be told about four kinds of people on 
the planet Kyron: blens, who are born young 
and die young; kwefs, who are born old and 
die old; balts, who are born young and die 
old; and prosses, who are born old and die 
young (Sternberg, 1981, 1982; Tetewsky & 
Sternberg, 1986). Their task was to predict 
future states from past states, given incom- 
plete information. In another set of stud- 
ies, 60 people were given more conven- 
tional kinds of inductive reasoning problems, 
such as analogies, series completions, and 
classifications. However, the problems had 
premises preceding them that were either 
conventional (dancers wear shoes) or novel 
(dancers eat shoes). The participants had 
to solve the problems as though the coun- 
terfactuals were true (Sternberg & Gastel, 
1989a, 1989b). 
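
The logic of the grue-bleen items can be sketched in Python as follows. The function names are hypothetical, and treating "until the year 2000" as including the year 2000 itself is an assumption; the actual test items are not reproduced here. The sketch shows why an observation made before the switch leaves the future state underdetermined.

```python
SWITCH_YEAR = 2000

# Each concept maps to (color up to the switch year, color afterwards).
CONCEPTS = {
    "green": ("green", "green"),
    "blue": ("blue", "blue"),
    "grue": ("green", "blue"),
    "bleen": ("blue", "green"),
}


def appearance(concept, year):
    """Color an object of the given concept shows in the given year."""
    before, after = CONCEPTS[concept]
    # Assumption: "until the year 2000" includes the year 2000 itself.
    return before if year <= SWITCH_YEAR else after


def consistent_concepts(observed_color, year):
    """Concepts that could explain seeing observed_color in the given year."""
    return sorted(c for c in CONCEPTS if appearance(c, year) == observed_color)


def possible_future_colors(observed_color, year_seen, year_asked):
    """Colors the object might show later, given one earlier observation."""
    candidates = consistent_concepts(observed_color, year_seen)
    return sorted({appearance(c, year_asked) for c in candidates})


print(consistent_concepts("green", 1990))           # ['green', 'grue']
print(possible_future_colors("green", 1990, 2010))  # ['blue', 'green']
```

An object seen as green in 1990 may be green or grue, so both colors remain possible for 2010 until further information is given.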

In these studies, we found that correla- 
tions with conventional kinds of tests de- 
pended on how novel or nonentrenched the 
conventional tests were. The more novel the 
items, the higher the correlations of our tests 
with scores on successively more novel con- 


lated for relatively novel items would tend 
to correlate more highly with more unusual 
tests of fluid abilities than with tests of crys- 
tallized abilities. We also found that when 
response times on the relatively novel prob- 
lems were componentially analyzed, some 
components better measured the creative as- 
pect of intelligence than did others. For ex- 
ample, in the “grue-bleen” task mentioned 
previously, the information processing com- 
ponent requiring people to switch from con- 
ventional green-blue thinking to grue-bleen 
thinking, and then back to green-blue think- 
ing again, was a particularly good measure of 
the ability to cope with novelty. 

Computer simulation approaches, re- 
viewed by Boden (1992, 1999), have as their 
goal the production of creative thought by a 
computer in a manner that simulates what 
people do. Langley, Simon, Bradshaw, and 
Zytkow (1987), for example, developed a 
set of programs that rediscover basic sci- 
entific laws. These computational models 
rely on heuristics — problem-solving guide- 
lines — for searching a data set or conceptual 
space and finding hidden relationships be- 
tween input variables. The initial program, 
called BACON, uses heuristics such as “if 
the value of two numeric terms increase to- 
gether, consider their ratio” to search data 
for patterns. One of BACON’s accomplish- 
ments has been to examine observational 
data on the orbits of planets available to 
Kepler and to rediscover Kepler’s third law 
of planetary motion. This program is un- 
like creative functioning, however, in that 
the problems are given to it in a struc- 
tured form, whereas creative functioning is 
largely about figuring out what the prob- 
lems are (see Runco, 1994). Further pro- 
grams have extended the search heuristics, 
the ability to transform data sets, and the 
ability to reason with qualitative data and 
scientific concepts. There are also models 
concerning an artistic domain. For example, 
Johnson-Laird (1988) developed a jazz im- 
provisation program in which novel devia- 
tions from the basic jazz chord sequences 
are guided by harmonic constraints (or 
tacit principles of jazz) and random choice 
improvisation exist. 

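The ratio heuristic quoted above for BACON can be illustrated with a small Python sketch. This is not the original BACON program: the companion rule (take the product when one term rises as the other falls), the constancy test with a 2% tolerance, and the exponent bookkeeping are assumptions added to make the search runnable, and the orbital figures are approximate modern values rather than the data available to Kepler.

```python
from itertools import combinations

# Approximate semi-major axis D (astronomical units) and period P (years).
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def is_constant(values, tolerance):
    """True if every value lies within a relative tolerance of the mean."""
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= tolerance * abs(mean) for v in values)


def trend(xs, ys):
    """+1 if ys always rises with xs, -1 if it always falls, else 0."""
    pairs = sorted(zip(xs, ys))
    diffs = [b[1] - a[1] for a, b in zip(pairs, pairs[1:])]
    if all(d > 0 for d in diffs):
        return 1
    if all(d < 0 for d in diffs):
        return -1
    return 0


def bacon_search(d, p, max_rounds=5, tolerance=0.02):
    """Return exponents (i, j) of a near-constant term D**i * P**j, or None."""
    # Each term is keyed by its exponent pair so duplicates can be skipped.
    terms = {(1, 0): list(d), (0, 1): list(p)}
    for _ in range(max_rounds):
        for exps, values in terms.items():
            if is_constant(values, tolerance):
                return exps
        new_terms = {}
        for (ea, va), (eb, vb) in combinations(terms.items(), 2):
            direction = trend(va, vb)
            if direction == 0:
                continue
            # Rising together: ratio (subtract exponents).
            # Moving oppositely: product (add exponents).
            exps = tuple(x - direction * y for x, y in zip(ea, eb))
            if exps == (0, 0) or exps in terms or exps in new_terms:
                continue
            new_terms[exps] = [a / b if direction == 1 else a * b
                               for a, b in zip(va, vb)]
        terms.update(new_terms)
    return None


distances = [d for d, _ in PLANETS.values()]
periods = [p for _, p in PLANETS.values()]
print(bacon_search(distances, periods))  # (3, -2), i.e. D**3 / P**2 is constant
```

Under these assumptions the search forms D/P, then D²/P, and then their product D³/P², which is nearly constant across the planets, matching the form of Kepler's third law. As the text notes, the data are handed to the procedure already structured, which is where it differs from human creative work.
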

Social-Personality and Social-Cognitive 
Approaches 


Developing in parallel with the cognitive ap- 
proach, work in the social-personality ap- 
proach has focused on personality variables, 
motivational variables, and the sociocul- 
tural environment as sources of creativity. 
Researchers such as Amabile (1983), Bar- 
ron (1968, 1969), Eysenck (1993), Gough 
(1979), MacKinnon (1965), and others 
noted that certain personality traits often 
characterize creative people. Through cor- 
relational studies and research contrasting 
high and low creative samples (at both emi- 
nent and everyday levels), a large set of po- 
tentially relevant traits has been identified 
(Barron & Harrington, 1981; Feist, 1999). 
These traits include independence of judg- 
ment, self-confidence, attraction to com- 
plexity, aesthetic orientation, openness to 
experience, and risk taking. 

Proposals regarding self-actualization and 
creativity can also be considered within the 
personality tradition. According to Maslow 
(1968), boldness, courage, freedom, spon- 
taneity, self-acceptance, and other traits lead 
a person to realize his or her full poten- 
tial. Rogers (1954) described the tendency 
toward self-actualization as having motiva- 
tional force and being promoted by a sup- 
portive, evaluation-free environment. These 
ideas, however, seem at odds with the many 
studies that have linked creativity and mental 
illness (e.g., Kaufman, 2001a, 2001b; 
Kaufman & Baer, 2002; Ludwig, 1995). If full 
creative potential is truly linked with self- 
acceptance and other positive traits, then 
one would not expect so many eminent 
creative individuals to be maladjusted and 
to have such poor coping strategies (Kaufman, 
2002; Kaufman & Sternberg, 2000). 

Focusing on motivation for creativity, a 
number of theorists have hypothesized the 
relevance of intrinsic motivation (Amabile, 
1983, 1996; Crutchfield, 1962; Golann, 
1962), need for order (Barron, 1963), need 
for achievement (McClelland, Atkinson, 


Amabile (1983, 1996; Hennessey & Ama- 
bile, 1988) and her colleagues conducted 
seminal research on intrinsic and extrinsic 
motivation. Studies using motivational train- 
ing and other techniques have manipulated 
these motivations and observed effects on 
creative performance tasks, such as writing 
poems and making collages. 

Finally, the relevance of the social envi- 
ronment to creativity has also been an active 
area of research. At the societal level, Simon- 
ton (1984, 1988, 1994, 1999) conducted nu- 
merous studies in which eminent levels of 
creativity over large spans of time in diverse 
cultures have been statistically linked to en- 
vironmental variables. These variables in- 
clude, among others, cultural diversity, war, 
availability of role models, availability of re- 
sources (e.g., financial support), and number 
of competitors in a domain. Cross-cultural 
comparisons (e.g., Lubart, 1990) and anthro- 
pological case studies (e.g., Maduro, 1976; 
Silver, 1981) have demonstrated cultural 
variability in the expression of creativity. 
Moreover, they have shown that cultures dif- 
fer simply in the amount that they value the 
creative enterprise. 

The social-cognitive and social-persona- 
lity approaches have each provided valuable 
insights into creativity. However, if you look 
for research that investigates both social-cognitive 
and social-personality variables at 
the same time, you will find only a handful 
of studies. The cognitive work on creativity 
has tended to ignore the personality 
and social system, and the social-personality 
approaches have tended to have little or nothing 
to say about the mental representations and 
processes underlying creativity. 

Looking beyond the field of psychology, 
Wehner, Csikszentmihalyi, and Magyari- 
Beck (1991) examined 100 more recent doc- 
toral dissertations on creativity. They found 
a “parochial isolation” of the various stud- 
ies concerning creativity. There were rele- 
vant dissertations from psychology, educa- 
tion, business, history, history of science, and 
other fields, such as sociology and political 
science. However, the different fields tended 
to use different terms and focus on different 
basic phenomenon. For example, business 
dissertations used the term “innovation” and 
tended to look at the organizational level, 
whereas psychology dissertations used the 
term “creativity” and looked at the level of 
the individual. Wehner, Csikszentmihalyi, 
and Magyari-Beck (1991) described the sit- 
uation with creativity research in terms of 
the fable of the blind men and the elephant. 
“We touch different parts of the same beast 
and derive distorted pictures of the whole 
from what we know: “The elephant is like a 
snake,’ says the one who only holds its tail; 
‘The elephant is like a wall,’ says the one 
who touches its flanks” (p. 270). 


Evolutionary Approaches to Creativity 


The evolutionary approach to creativity was 
instigated by Donald Campbell (1960), who 
suggested that the same kinds of mecha- 
nisms that have been applied to the study 
of the evolution of organisms could be ap- 
plied to the evolution of ideas. This idea 
has been enthusiastically picked up by a 
number of investigators (Simonton, 1995, 
1998, 1999). 

The basic idea underlying this approach 
is that there are two basic steps in the gen- 
eration and propagation of creative ideas. 
The first is blind variation, by which the cre- 
ator generates an idea without any real idea 
of whether the idea will be successful (se- 
lected for) in the world of ideas. Indeed, 
Dean Simonton (1996) argued that creators 
do not have the slightest idea as to which 
of their ideas will succeed. As a result, their 
best bet for producing lasting ideas is to go 
for a large quantity of ideas. The reason is 
that their hit rate remains relatively constant 
through their professional life span. In other 
words, they have a fixed proportion of ideas 
that will succeed. The more ideas they have 
in all, the more ideas they have that will 
achieve success. 
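
The claim of a constant hit rate can be made concrete with a small simulation in Python. The hit rate of 0.1, the career sizes, and the assumption that each idea succeeds independently are hypothetical choices for illustration and are not estimates from Simonton's data.

```python
import random


def simulate_career(n_ideas, hit_rate, rng):
    """Count successes when every idea has the same, independent chance."""
    return sum(1 for _ in range(n_ideas) if rng.random() < hit_rate)


HIT_RATE = 0.1           # hypothetical fixed proportion of successful ideas
rng = random.Random(0)   # fixed seed so the run is repeatable

for n_ideas in (100, 1000, 10000):
    hits = simulate_career(n_ideas, HIT_RATE, rng)
    # The proportion of hits stays near HIT_RATE, so total hits grow with output.
    print(n_ideas, hits, round(hits / n_ideas, 3))
```

Under this model the expected number of successes is simply the hit rate multiplied by the number of ideas, which is why quantity is the creator's best bet.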

The second step is selective retention. In 
this step, the field in which the creator works 
either retains the idea for the future or lets 
it die out. Those ideas that are selectively 
retained are the ones that are judged to be 


process, as well as blind generation, are de- 
scribed further by Cziko (1998). 

Does an evolutionary model really ade- 
quately describe creativity? Robert Stern- 
berg (1997, 2003) argued that it does not, 
and David Perkins (1998) also had doubts. 
Sternberg argued that it seems utterly im- 
plausible that great creators such as Mozart, 
Einstein, or Picasso were using nothing more 
than blind variation to come up with their 
ideas. Good creators, like experts of any 
kind, may or may not have more ideas than 
other people have, but they have better 
ideas, ones that are more likely to be se- 
lectively retained. The reason they are more 
likely to be selectively retained is that they 
were not produced in a blind fashion. This 
debate is by no means resolved, however, 
and is likely to continue for some time. 

Perkins (1995, 1998) argued that the 
analogy between biological evolution and 
creativity is oversimplified. In particular 
(Perkins, 1998), biological evolution relies 
on massive parallel search for mutations 
(millions of bacteria, for example, are mutat- 
ing every second), whereas humans do not. 
At the same time, humans can do fairly ex- 
tensive searches, such as when they seek out 
new antibiotics. 

Were it the case that an understanding 
of creativity required a multidisciplinary ap- 
proach, the result of a unidisciplinary ap- 
proach might be that we would view a part 
of the whole as the whole. At the same time, 
though, we would have an incomplete ex- 
planation of the phenomenon we are seek- 
ing to explain, leaving dissatisfied those who 
do not subscribe to the particular discipline 
doing the explaining. We believe that tradi- 
tionally this has been the case for creativ- 
ity. More recently, theorists have begun to 
develop confluence approaches to creativity, 
which we now discuss. 


Confluence Approaches to the Study 
of Creativity 


Many more recent works on creativity 
hypothesize that multiple components must 
1983; Csikszentmihalyi, 1988; Gardner, 
1993; Gruber, 1989; Gruber & Wallace, 
1999; Lubart, 1994, 1999; Lubart, Mouchi- 
roud, Tordjman, & Zenasni, 2003; Mumford 
& Gustafson, 1988; Perkins, 1981; Simonton, 
1988; Sternberg, 1985b; Sternberg & 
Lubart, 1991, 1995, 1996; Weisberg, 1993; 
Woodman & Schoenfeldt, 1989). Sternberg 
(1985b), for example, examined laypersons’ 
and experts’ conceptions of the creative 
person. People’s implicit theories contain 
a combination of cognitive and personality 
elements, such as “connects ideas,” “sees 
similarities and differences,” “has flexibility,” 
“has aesthetic taste,” “is unorthodox,” “is 
motivated,” “is inquisitive,” and “questions 
societal norms.” 

At the level of explicit theories, Amabile 
(1983, 1996; Collins & Amabile, 1999) de- 
scribed creativity as the confluence of intrin- 
sic motivation, domain-relevant knowledge 
and abilities, and creativity-relevant skills. 
The creativity-relevant skills include 


1. a cognitive style that involves coping with 
complexities and breaking one’s mental 
set during problem solving; 

2. knowledge of heuristics for generating 
novel ideas, such as trying a counterintu- 
itive approach; and 

3. a work style characterized by concen- 
trated effort, an ability to set aside prob- 
lems, and high energy. 


Gruber (1981, 1989) and Gruber and 
Davis (1988) proposed a developmental 
evolving-systems model for understanding 
creativity. A person’s knowledge, purpose, 
and affect grow over time, amplify devi- 
ations that an individual encounters, and 
lead to creative products. Developmen- 
tal changes in the knowledge system have 
been documented in cases such as Charles 
Darwin’s thoughts on evolution. Purpose 
refers to a set of interrelated goals, which 
also develop and guide an individual’s behav- 
ior. Finally, the affect or mood system notes 
the influence of joy or frustration on the 
projects undertaken. 

Csikszentmihalyi (1988, 1996; Feldman, 
Csikszentmihalyi, & Gardner, 1994) took 
a systems approach and highlighted the 
interaction of the individual, domain, 
and field. An individual draws upon 
information in a domain and transforms or 
extends it via cognitive processes, person- 
ality traits, and motivation. The field, con- 
sisting of people who control or influence a 
domain (e.g., art critics and gallery owners), 
evaluates and selects new ideas. The domain, 
a culturally defined symbol system such as 
alphabetic writing, mathematical notation, 
or musical notation, preserves and transmits 
creative products to other individuals and 
future generations. Gardner (1993; see also 
Policastro & Gardner, 1999) conducted case 
studies that suggest that the development of 
creative projects may stem from an anomaly 
within a system (e.g., tension between com- 
peting critics in a field) or moderate asyn- 
chronies between the individual, domain, 
and field (e.g., unusual individual talent for 
a domain). In particular, Gardner (1993) an- 
alyzed the lives of seven individuals who 
made highly creative contributions in the 
twentieth century with each specializing in 
one of the multiple intelligences (Gardner, 
1983): Sigmund Freud (intrapersonal), Al- 
bert Einstein (logical-mathematical), Pablo 
Picasso (spatial), Igor Stravinsky (musical), 
T. S. Eliot (linguistic), Martha Graham 
(bodily-kinesthetic), and Mohandas Gandhi 
(interpersonal). Charles Darwin would be 
an example of someone with extremely 
high naturalist intelligence. Gardner pointed 
out, however, that most of these individ- 
uals actually had strengths in more than 
one intelligence and that they also had no- 
table weaknesses in others (e.g., Freud’s 
weaknesses may have been in spatial and 
musical intelligences). 

Although creativity can be understood in 
terms of uses of the multiple intelligences 
to generate new and even revolutionary 
ideas, Gardner’s (1993) analysis goes well be- 
yond the intellectual. For example, Gardner 
pointed out two major themes in the behav- 
ior of these creative giants. First, they tended 
to have a matrix of support at the time 
of their creative breakthroughs. Second, 
they tended to drive a “Faustian bargain,” 
whereby they gave up many of the plea- 
sures people typically enjoy in life to attain 
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ever, it is not clear that these attributes are 
intrinsic to creativity, per se; rather, they 
seem to be associated with those who have 
been driven to exploit their creative gifts in 
a way that leads them to attain eminence. 

Gardner (1993) further followed Csik- 
szentmihalyi (1988, 1996) in distinguishing 
between the importance of the domain (the 
body of knowledge about a particular sub- 
ject area) and the field (the context in which 
this body of knowledge is studied and elab- 
orated, including the persons working with 
the domain, such as critics, publishers, and 
other “gatekeepers”). Both are important to 
the development, and, ultimately, the recog- 
nition of creativity. 

A final confluence theory considered here 
is Sternberg and Lubart’s (1991, 1995) invest- 
ment theory of creativity. According to this 
theory, creative people are ones who are will- 
ing and able to “buy low and sell high” in 
the realm of ideas (see also Lubart & Runco, 
1999; Rubenson & Runco, 1992, for use 
of concepts from economic theory). Buying 
low means pursuing ideas that are unknown 
or out of favor but that have growth poten- 
tial. Often, when these ideas are first pre- 
sented, they encounter resistance. The cre- 
ative individual persists in the face of this 
resistance, and eventually sells high, moving 
on to the next new or unpopular idea. 

Preliminary research within the invest- 
ment framework has yielded support for this 
model (Lubart & Sternberg, 1995). This re- 
search has used tasks such as 


1. writing short stories using unusual titles 
(e.g., “the octopus’ sneakers”), 


2. drawing pictures with unusual themes 
(e.g., “the earth from an insect’s point of 
view”), 

3. devising creative advertisements for bor- 
ing products (e.g., cufflinks), and 

4. solving unusual scientific problems (e.g., 
how could we tell if someone had been 
on the moon within the past month?). 


This research showed creative performance 
to be moderately domain specific and to be 
predicted by a combination of six distinct 
but interrelated resources: intellectual abilities, 
knowledge, styles of thinking, personality, 
motivation, and environment. 

Concerning the confluence of compo- 
nents, creativity is hypothesized to involve 
more than a simple sum of a person’s 
level on each component. First, there may 
be thresholds for some components (e.g., 
knowledge), below which creativity is not 
possible regardless of the levels on other 
components. Second, partial compensation 
may occur in which a strength on one 
component (e.g., motivation) counteracts a 
weakness on another component (e.g., envi- 
ronment). Third, interactions may also oc- 
cur between components, such as intelli- 
gence and motivation, in which high levels 
on both components could multiplicatively 
enhance creativity. 
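
The three ways in which components might combine can be made concrete with a toy calculation. The sketch below is purely illustrative and is not the investment theory's actual model: the function name, the component scale (0 to 1), the threshold value, and the particular combination rule are all assumptions chosen only to show a threshold, partial compensation, and a multiplicative interaction side by side.

```python
# Hypothetical threshold: below this knowledge level the toy score is zero.
KNOWLEDGE_THRESHOLD = 0.2


def toy_creativity(knowledge, intelligence, motivation, environment):
    """Combine component levels (each assumed to lie in [0, 1]) into a toy score."""
    # Threshold: with too little knowledge, no other component can help.
    if knowledge < KNOWLEDGE_THRESHOLD:
        return 0.0
    # Partial compensation: in an average, a strength on one component
    # offsets a weakness on another.
    additive = (knowledge + intelligence + motivation + environment) / 4
    # Interaction: a product term rewards being high on both components at once.
    interaction = intelligence * motivation
    return additive + interaction


# Being high on both interacting components adds more than the two
# separate gains would suggest.
both = toy_creativity(0.5, 1.0, 1.0, 0.5)
only_intelligence = toy_creativity(0.5, 1.0, 0.0, 0.5)
only_motivation = toy_creativity(0.5, 0.0, 1.0, 0.5)
neither = toy_creativity(0.5, 0.0, 0.0, 0.5)
print(both - only_intelligence, only_motivation - neither)
```

Any real test of the confluence hypothesis would need empirically estimated weights and thresholds; the numbers here carry no empirical meaning.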

In general, confluence theories of creativ- 
ity offer the possibility of accounting for di- 
verse aspects of creativity (Lubart, 1994). 
For example, analyses of scientific and artis- 
tic achievements suggest that the median- 
rated creativity of work in a domain tends to 
fall toward the lower end of the distribution 
and the upper — high creativity — tail extends 
quite far. This pattern can be explained 
through the need for multiple components 
of creativity to co-occur in order for the 
highest levels of creativity to be achieved. As 
another example, the partial domain speci- 
ficity of creativity that is often observed 
can be explained through the mixture of 
some relatively domain-specific components 
for creativity, such as knowledge, and other 
more domain-general components, such as, 
perhaps, the personality trait of persever- 
ance. Creativity, then, is largely something 
that people show in a particular domain. 


Alternate Approaches to 
Understanding Kinds of Creative 
Contributions 


Generally, we think of creative contribu- 
tions as being of a single kind. However, 
a number of researchers on creativity have 
questioned this assumption. There are many 
ways of distinguishing among types of cre- 
ative contributions. It is important to re- 
member, though, that creative contributions 
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times. At a given time, the field can never 
be sure of whose work will withstand the 
judgments of the field over time (e.g., that 
of Mozart) and whose work will not (e.g., 
that of Salieri) (Therivel, 1999). 

Theorists of creativity and related top- 
ics have recognized that there are differ- 
ent types of creative contributions (see re- 
views in Ochse, 1990; Sternberg, 1988c; 
Weisberg, 1993). For example, Kuhn (1970) 
distinguished between normal and revolu- 
tionary science. Normal science expands 
upon or otherwise elaborates upon an al- 
ready existing paradigm of scientific re- 
search, whereas revolutionary science pro- 
poses a new paradigm (see Dunbar & Fugel- 
sang, Chap. 29). The same kind of distinction 
can be applied to the arts and letters. 

Gardner (1993, 1994) also described dif- 
ferent types of creative contributions indi- 
viduals can make. They include 


1. the solution of a well-defined problem, 

2. the devising of an encompassing theory, 

3. the creation of a “frozen work,” 

4. the performance of a ritualized work, 
and 

5. a “high-stakes” performance. 


Each type of creativity has as its result a 
different kind of creative product. 


Other bases for distinguishing among 
types of creative contributions also exist. For 
example, psychoeconomic models such as 
those of Rubenson and Runco (1992) and 
Sternberg and Lubart (1991, 1995, 1996) can 
distinguish different types of contributions 
in terms of the parameters of the models. 
In the Sternberg-Lubart model, contribu- 
tions might differ in the extent to which they 
“defy the crowd” or in the extent to which 
they redefine how a field perceives a set 
of problems. 

Simonton’s (1997) model of creativity 
also proposes parameters of creativity, and 
various kinds of creative contributions might 
be seen as differing in terms of the extent 
to which they vary from other contributions 
and the extent to which they are selected for 
recognition by a field of endeavor (see also 
Campbell, 1960; Perkins, 1995; Simonton, 


els intended explicitly to distinguish among 
types of creative contributions. 

Maslow (1967) distinguished more gener- 
ally between two types of creativity, which 
he referred to as primary and secondary. 
Primary creativity is the kind of creativity 
a person uses to become self-actualized — 
to find fulfillment in him- or herself and 
his or her life. Secondary creativity is the 
kind of creativity with which scholars in 
the field are more familiar — the kind that 
leads to creative achievements recognized by 
a field. 

Ward, Smith, and Finke (1999) noted that 
there is evidence to favor the roles of both fo- 
cusing (Bowers et al., 1990; Kaplan & Simon, 
1990) and exploratory thinking (Bransford 
& Stein, 1984; Getzels & Csikszentmiha- 
lyi, 1976) in creative thinking. In focusing, 
one concentrates on pursuing a single 
problem-solving approach, whereas in ex- 
ploratory thinking one considers many such 
approaches. A second distinction made by 
Ward and his colleagues is between domain- 
specific (Clement, 1989; Langley, Simon, 
Bradshaw, & Zytkow, 1987; Perkins, 1981; 
Weisberg, 1986) and universal (Finke, 1990, 
1995; Guilford, 1968; Koestler, 1964) cre- 
ativity skills. Finally, Ward and his colleagues 
distinguish between unstructured (Bateson, 
1979; Findlay & Lumsden, 1988; Johnson- 
Laird, 1988) and structured or systematic 
(Perkins, 1981; Ward, 1994; Weisberg, 1986) 
creativity, where the former is displayed in 
systems with relatively few rules, and the lat- 
ter, in systems with many rules. 

There are tens of thousands of artists, mu- 
sicians, writers, scientists, and inventors to- 
day. What makes some of them stand out 
from the rest? Why will some of them be- 
come distinguished contributors in the an- 
nals of their field and others be forgotten? 
Although many variables may contribute to 
who stands out from the crowd, certainly 
creativity is one of them. The standouts 
are often those who are doing particu- 
larly creative work in their line of profes- 
sional pursuit. Are these highly creative in- 
dividuals simply doing more highly creative 
work than their less visible counterparts, or 
does the creativity of their work also differ 
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contributors make different decisions regard- 
ing how to express their creativity. This 
section describes a propulsion theory of 
creative contributions (Sternberg, 1999b; 
Sternberg, Kaufman, & Pretz, 2002) that 
addresses this issue of how people decide 
to invest their creative resources. The ba- 
sic idea is that creativity can be of dif- 
ferent kinds, depending on how it pro- 
pels existing ideas forward. When devel- 
oping creativity in children, we can foster 
different kinds of creativity, ranging from 
minor replications to major redirections in 
their thinking. 

Creative contributions differ not only in 
their amounts but also in the types of creativ- 
ity they represent. For example, both Sig- 
mund Freud and Anna Freud were highly 
creative psychologists, but the nature of 
their contributions seems in some way or 
ways to have been different. Sigmund Freud 
proposed a radically new theory of human 
thought and motivation, and Anna Freud 
largely elaborated on and modified Sigmund 
Freud’s theory. How do creative contribu- 
tions differ in quality and not just in quantity 
of creativity? 

The type of creativity exhibited in a cre- 
ator’s works can have at least as much of an 
effect on judgments about that person and 
his or her work as does the amount of cre- 
ativity exhibited. In many instances, it may 
have more of an effect on these judgments. 

Given the importance of purpose, cre- 
ative contributions must always be defined 
in some context. If the creativity of an in- 
dividual is judged in a context, then it will 
help to understand how the context interacts 
with how people are judged. In particular, 
what are the types of creative contributions 
a person can make within a given context? 
Most theories of creativity concentrate on 
attributes of the individual (see Sternberg, 
1999b). However, to the extent that creativ- 
ity depends on the interaction of person with 
context, we would also need to concentrate 
on the attributes of the individual and the in- 
dividual’s work relative to the environmen- 
tal context. 

A taxonomy of creative contributions 
needs to deal with the question not just of in 
what domain a contribution is creative but also 
of what the type of creative contribution is. 
What makes one work in biology more cre- 
ative or creative in a different way from an- 
other work in biology, or what makes its cre- 
ative contribution different from that of a 
work in art? Thus, a taxonomy of domains 
of work is insufficient to elucidate the nature 
of creative contributions. A field needs a ba- 
sis for scaling how creative contributions dif- 
fer quantitatively and, possibly, qualitatively. 
For instance, 


1. Replication. The contribution is an at- 
tempt to show that the field is in the 
right place. The propulsion keeps the field 
where it is rather than moving it. This type 
of creativity is represented by stationary 
motion, as of a wheel that is moving but 
staying in place. 

2. Redefinition. The contribution is an at- 
tempt to redefine where the field is. The 
current status of the field thus is seen from 
different points of view. The propulsion 
leads to circular motion such that the cre- 
ative work leads back to where the field is 
but as viewed in a different way. 


3. Forward Incrementation. The contribution 
is an attempt to move the field forward 
in the direction it already is going. The 
propulsion leads to forward motion. 


4. Advance Forward Incrementation. The 
contribution is an attempt to move the 
field forward in the direction it is al- 
ready going, but by moving beyond where 
others are ready for it to go. The propul- 
sion leads to forward motion that is accel- 
erated beyond the expected rate of for- 
ward progression. 

5. Redirection. The contribution is an at- 
tempt to redirect the field from where it is 
toward a different direction. The propul- 
sion thus leads to motion in a direction 
that diverges from the way the field is cur- 
rently moving. 

6. Reconstruction/Redirection. The contribu- 
tion is an attempt to move the field back 
to where it once was (a reconstruction of 
the past) so it may move onward from that 
point, but in a direction different from 
the one it took from that point onward. 
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is backward and then redirective. 


7. Reinitiation. The contribution is an at- 
tempt to move the field to a different, as 
yet unreached starting point and then to 
move from that point. The propulsion is 
thus from a new starting point in a direc- 
tion that is different from the one the field 
has previously pursued. 


8. Integration. The contribution is an at- 
tempt to integrate two formerly diverse 
ways of thinking about phenomena into 
a single way of thinking about a phe- 
nomenon. The propulsion thus is a com- 
bination of two different approaches that 
are linked together. 


The eight types of creative contributions 
described previously are largely qualitatively 
distinct. Within each type, however, there 
can be quantitative differences. For exam- 
ple, a forward incrementation can represent 
a fairly small step forward or a substan- 
tial leap. A reinitiation can restart a subfield 
(e.g., the work of Leon Festinger on cogni- 
tive dissonance) or an entire field (e.g., the 
work of Einstein on relativity theory). Thus, 
the theory distinguishes contributions both 
qualitatively and quantitatively. 


Conclusions and Future Directions 


In sum, creativity, which has often been 
viewed as beyond study, is anything but. 
Creativity can be understood about as well as 
any psychological construct, if appropriate 
methods are brought to bear upon its inves- 
tigations. The history of creativity theory and 
research is long and interesting. It represents 
a diversity of attempts to understand the 
phenomenon. More recently, scholars have 
recognized that creativity can be of multi- 
ple kinds and have tried to understand these 
different kinds. A full account of creativity 
would need to take into account not just 
differing amounts of creativity but differing 
kinds. These kinds would include creativ- 
ity that accepts current paradigms, creativity 
that rejects them, and creativity that synthe- 
sizes them into a new whole. 


CHAPTER 16 


Complex Declarative Learning 


Michelene T. H. Chi 
Stellan Ohlsson 


Introduction 


How do people acquire a complex body 
of knowledge, such as the history of the 
Panama Canal, the structure of the so- 
lar system, or the explanation for how 
the human circulatory system works? Com- 
plex learning takes longer than a few min- 
utes and requires processes that are more 
complicated than the associative processes 
needed to memorize pairs of words. The 
materials that support complex learning — 
such as texts, illustrations, practice prob- 
lems, and instructor feedback presented in 
classrooms and elsewhere — are often dif- 
ficult to understand and might require ex- 
tensive processing. For example, learning 
about the human circulatory system requires 
many component processes, such as inte- 
grating information from several sources, 
generating inferences, connecting new infor- 
mation with existing knowledge, retrieving 
appropriate analogies, producing explana- 
tions, coordinating different representations 
and perspectives, abandoning or rejecting 
prior concepts that are no longer useful, and 


so forth. Many of these component processes 
are still poorly understood, so we have even 
less understanding of the complex process of 
learning a large body of knowledge. 

Complex knowledge can be partitioned 
into two types: declarative knowledge and 
procedural knowledge (see Lovett & Ander- 
son, Chap. 17). Declarative knowledge has 
traditionally been defined as knowledge of 
facts or knowing that, whereas procedural 
knowledge is knowing how (Anderson, 1976; 
Winograd, 1975). Declarative knowledge is 
descriptive and use-independent. It em- 
bodies concepts, principles, ideas, schemas, 
and theories (Ohlsson, 1994, 1996). Exam- 
ples of declarative knowledge are the laws 
of the number system, Darwin’s theory of 
evolution, and the history of the Panama 
Canal. The sum total of a person’s declar- 
ative knowledge is his or her understanding 
of the way the world, or some part or aspect 
of the world, works, independently of the 
particular tasks the person undertakes. 

Procedural knowledge, such as how to op- 
erate and troubleshoot a machine, how to 
solve a physics problem, or how to use a 
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specific. It consists of associations between 
goals, situations, and actions. Research in 
cognitive neuroscience supports the reality 
of this distinction between declarative and 
procedural knowledge (Squire, 1987). 

The acquisition of complex procedural 
knowledge has been extensively investigated 
in laboratory studies of skill acquisition, 
problem solving, and expertise (Ericsson, 
1996; Feltovich, Ford, & Hoffman, 1997; see 
Novick & Bassok, Chap. 14), and in field 
studies of practitioners (Hutchins, 1995; 
Keller & Keller, 1996). Issues that have 
been explored include the role of percep- 
tual organization in expert decision mak- 
ing, the breakdown of goals into subgoals, 
the effect of ill-defined goals, the nature of 
search strategies, choices between compet- 
ing strategies, the conditions of transfer of 
problem-solving strategies from one prob- 
lem context to another, the effect of alter- 
native problem representations, the role of 
collaboration in complex tasks, and so on. 
As is obvious in this chapter, the issues rel- 
evant to the study of complex procedural 
learning are different from those relevant 
to the study of complex declarative learn- 
ing. Because the acquisition of procedural 
knowledge has been researched so exten- 
sively in the past few decades, there are 
several recent reviews (Lovett, 2002; Van- 
Lehn, 1989; see Novick & Bassok, Chap. 14). 
Therefore, this chapter focuses primarily 
on the acquisition of a body of declara- 
tive knowledge. 

The study of complex declarative learning 
is still in its infancy and has not yet produced 
a unified theory or paradigmatic framework. 
The organization of this chapter is meant 
to suggest one form that such a framework 
might take. In the first section, we describe 
basic characteristics of complex declarative 
knowledge. In the second section, we clas- 
sify the different types of changes that oc- 
cur in declarative knowledge as one learns. 
This classification is the main contribution 
of the chapter. The third section is a brief 
treatment of the so-called learning paradox 
(Bereiter, 1985). We end with a few conclud- 
ing remarks. 


Knowledge 


Size of Knowledge Base 


The most basic observation one can make 
about declarative knowledge is that humans 
have a lot of it. There are no precise esti- 
mates of the amount of knowledge a person 
possesses, but two attempts at an estimate 
seem well grounded. The first is an esti- 
mate of the size of the mental lexicon. The 
average college-educated adult knows be- 
tween 40,000 and 60,000 words (Miller, 
1996, pp. 136-138). The total number of 
words in the English language is larger than 
100,000. Because concepts only constitute a 
subset of declarative knowledge, this repre- 
sents a lower bound on the size of a person’s 
declarative knowledge base. Second, Lan- 
dauer (1986) estimated how much informa- 
tion, measured in bits, people can remem- 
ber from a lifetime of learning. His estimate 
is 2 × 10^9 bits by age 70. It is not straight- 
forward to convert bits to concepts or pieces 
of knowledge, but even very fast comput- 
ers use only 32 or 64 bits to encode one 
basic instruction. If we make the conserva- 
tive assumption that it requires 1000 bits to 
encode one piece of knowledge, Landauer’s 
estimate implies that a person’s declarative 
knowledge base eventually approximates 
1 million pieces of knowledge. 
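
The arithmetic behind this estimate can be checked directly. The short sketch below uses only the two figures given in the text, Landauer's lifetime estimate and the assumed 1000 bits per piece of knowledge; the variable names are illustrative. The division yields 2 × 10^6, which is of the same order of magnitude as the figure of roughly 1 million cited above.

```python
LIFETIME_BITS = 2 * 10**9   # Landauer's (1986) estimate of bits retained by age 70
BITS_PER_PIECE = 1000       # the conservative assumption stated in the text

# Number of pieces of knowledge implied by the two figures above.
pieces = LIFETIME_BITS // BITS_PER_PIECE
print(f"{pieces:,} pieces of knowledge")  # prints "2,000,000 pieces of knowledge"
```

Because the bits-per-piece figure is itself only an assumption, the result is best read as an order-of-magnitude estimate rather than a precise count.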

These estimates apply to the size of the 
knowledge base as a whole. At the level of 
individual domains, estimates of the size of 
domain-specific knowledge bases tend to re- 
sult in numbers that are comparable to es- 
timates of the mental lexicon. For exam- 
ple, Simon and Gilmartin (1973) estimated 
the number of chess piece configurations — 
chunks or patterns — known by master play- 
ers to be between 10,000 and 100,000. We 
do not know whether this is a coincidence 
or a symptom of some deeper regularity. 

In short, even without a precise definition 
of what is to count as a unit of knowledge, 
the average person’s declarative knowledge 
base must be measured in tens of thousands, 
or more likely hundreds of thousands, of 
units. How all this knowledge — the raw 
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quired is clearly a nontrivial, but under- 
researched, question. 


Organization 


Knowledge does not grow as a set of iso- 
lated units but in some organized fashion. 
To capture the organization of the learners’ 
declarative knowledge, cognitive scientists 
operate with three distinct representational 
constructs: semantic networks, theories, and 
schemas (Markman, 1999). 

The key claim behind semantic networks 
is that a person’s declarative knowledge base 
can be thought of as a gigantic set of nodes 
(concepts) connected by links (relations). 
All knowledge is interrelated, and cognitive 
processes, such as retrieval and inferencing, 
operate by traversing the links. Early com- 
puter simulations of long-term memory for 
declarative knowledge explored variants of 
this network concept (Abelson, 1973; An- 
derson & Bower, 1973; Norman & Rumel- 
hart, 1975; Quillian, 1968; Schank, 1972; see 
Medin & Rips, Chap. 3). 

Because the distance between two nodes 
in a semantic network is determined by 
the number of relations one must traverse 
to reach from one to the other, semantic 
networks implicitly claim that declarative 
knowledge is grouped by domain. We use 
the term “domain” to refer to both infor- 
mal areas of knowledge, such as home dec- 
orating, eating at a restaurant, and watch- 
ing sports, and formal disciplines, such as 
botany, linguistics, and physics. Pieces of 
knowledge that belong to the same domain 
are similar in meaning and therefore clus- 
ter together functionally. Consistent with 
this notion, membership in the same domain 
tends to produce higher similarity ratings, 
stronger priming effects, and other quanti- 
tative behavioral consequences; descriptions 
of these well-known effects can be found 
in textbooks in cognitive psychology (e.g., 
Ashcraft, 2002; Reisberg, 2001). 
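
The notion of distance as the number of relations traversed can be illustrated with a small graph search. The sketch below is only a minimal illustration: the concepts, the links between them, and the function names are hypothetical and are not taken from any of the simulation models cited above. It treats links as unlabeled and undirected and counts the links on the shortest path between two nodes.

```python
from collections import deque

# Hypothetical toy network; each pair is one link (relation) between concepts.
LINKS = [
    ("rose", "flower"), ("tulip", "flower"), ("flower", "plant"),
    ("plant", "botany"), ("botany", "Latin names"),
    ("Latin names", "linguistics"), ("noun", "linguistics"),
]


def build_network(links):
    """Store the links as an undirected adjacency mapping."""
    network = {}
    for a, b in links:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def distance(network, start, goal):
    """Return the number of links on the shortest path, or None if unreachable."""
    if start not in network or goal not in network:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, steps = queue.popleft()
        if node == goal:
            return steps
        for neighbor in network[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, steps + 1))
    return None


NETWORK = build_network(LINKS)
print(distance(NETWORK, "rose", "tulip"))  # 2: both lie within the botany cluster
print(distance(NETWORK, "rose", "noun"))   # 6: the path crosses into another domain
```

In this toy network, concepts from the same domain sit a few links apart, whereas concepts from different domains are separated by longer paths, which is the sense in which such networks group knowledge by domain.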

The structure of any domain representa- 
tion depends on the dominant relations of 
that domain. If the dominant relation is set 
inclusion, the representation is organized as 
a hierarchy; taxonomies of animals and 
plants are prototypical examples. 
In contrast, relations such as cause-effect and 
before-after produce chain-like structures. In 
general, the representations of domains are 
locally structured by their dominant relations. 

The semantic network idea claims that 
all knowledge is interrelated, but it does 
not propose any single, overarching struc- 
ture for the network as a whole. Concepts 
and assertions are components of domains, 
but domains are not components of a yet 
higher level of organization. Domains relate 
to each other in a contingent rather than 
systematic way. Informal observations sup- 
port this notion. We have one concept hi- 
erarchy for tools and another for furniture, 
but the node lamp appears in both. Home 
decorating is not a subset of cooking, or vice 
versa, but the two share the kitchen. The con- 
cept of tangled hierarchies (Hofstadter, 1999) 
describes one aspect of local, unsystematic 
contact points between internally structured 
domains. These comments are somewhat 
speculative because there is little cognitive 
research aimed at elucidating the structure 
of the declarative knowledge base as a whole. 
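
The lamp example can be expressed as a small data structure in which a concept is allowed more than one parent. The sketch below is a hypothetical illustration only; the concept names beyond those in the text, the dictionary layout, and the function name are assumptions. It shows that two separately structured hierarchies can share a node without either being a subset of the other.

```python
# Hypothetical "is-a" links; a concept may have more than one parent.
PARENTS = {
    "desk lamp": {"lamp"},
    "lamp": {"tool", "furniture"},   # the shared node from the text
    "hammer": {"tool"},
    "chair": {"furniture"},
}


def ancestors(concept, parents=PARENTS):
    """Collect every concept reachable by following parent links upward."""
    found = set()
    stack = [concept]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(ancestors("desk lamp"))  # contains lamp, tool, and furniture
print(ancestors("hammer"))     # contains only tool
```

Note that the two top-level concepts have no common parent here, in keeping with the claim that domains are not components of a higher level of organization.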

Domains can also be represented as the- 
ories. Theories are “deep” representations 
(borrowing a term from social psychologists, 
see Rokeach, 1970) in the sense of having 
well-articulated center-periphery structures. 
That is, a theory is organized around a small 
set of core concepts or principles — big ideas — 
on which the rest of the elements in the 
domain are dependent. The core knowl- 
edge elements are typically fundamental and 
abstract, whereas the peripheral ones are 
based on, derived from, or instances of the 
core ones. The most pristine examples of 
center-periphery structures are the formal 
axiomatic systems of mathematics and logic 
in which a small set of chosen axioms pro- 
vide a basis for the proofs of all other the- 
orems in a particular formal theory, and 
natural science theories, such as Newton’s 
theory of mechanical motion, Darwin’s the- 
ory of biological evolution, and the atomic 
theory of chemical reactions. These theories 
are obviously experts’ and novices’ represen- 
tations of those same domains and may or 
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ing that change in structure is one dimension 
of complex learning. For example, DiSessa 
(1988, 1993) argued that novice knowledge 
of mechanical motion is not theory-like at 
all but is better thought of as an irregular 
collection of fragments (see Smith, DiSessa, 
& Roschelle, 1995, for a modified version of 
this view). 

Other cognitive scientists, however, pre- 
fer to represent the novices’ understandings 
of the natural world as intuitive theories in 
deliberate analogy with the explicit and cod- 
ified theories of scientists and mathemati- 
cians (Gopnik & Meltzoff, 1997; Gopnik & 
Wellman, 1994; McCloskey, 1983; Wiser & 
Carey, 1983). By referring to someone’s 
naive representation as a theory, one implies 
specifically that the representation shares 
certain characteristics with explicit theories; 
most prominently that it has a center- 
periphery structure. 

A well-developed center-periphery struc- 
ture is often the hallmark of an expert’s rep- 
resentation of a domain, and a comparison 
between novices’ and experts’ representa- 
tions of the same domain often reveals dif- 
ferences in the “depth” of their represen- 
tations. However, one can raise the ques- 
tion of whether “depth” should also be con- 
strued as a characteristic of the domain it- 
self. That is, are some domains intrinsically 
“deep” whereas others are not, so that a center- 
periphery structure is not an appropriate 
representation for some domains? If so, we 
would expect neither experts nor novices 
to construct “deep” representations of those 
domains. For example, in informal everyday 
domains such as home decorating or eating at 
a restaurant, the center-periphery structure 
is certainly less salient. (However, even if an 
everyday domain such as entertaining might 
not have a principled theory, its subdomain 
of formal table setting does; Bykofsky & Far- 
gis, 1995, pp. 144-146; Tuckerman & Dun- 
nan, 1995, pp. 176-177.) Moreover, even for 
informal domains such as cooking for which 
we as novices might claim to lack deep prin- 
ciples, many professional chefs would dis- 
agree. Thus, to what extent is the pervasive 
striving for a center-periphery structure with 


sentation, and to what extent is it an adap- 
tation to the objective structure of domains, 
remains an open question. 

The network concept codifies the intu- 
ition that everything is related to everything 
else, and the theory concept codifies the in- 
tuition that some knowledge elements are 
more important than others. The concept 
of a schema, however, codifies the intuition 
that much of our declarative knowledge 
represents recurring patterns in experience 
(see Holyoak, Chap. 6). Although the term 
“schema” has never been formally defined, 
the key strands in this construct are nev- 
ertheless clear. To a first approximation, a 
schema is a set of relations among a set of 
slots or attributes, where the slots can be 
thought of as variables that can take values 
within a specified range (Bobrow & Collins, 
1975; Brewer & Nakamura, 1984; Marshall, 
1995; Minsky, 1975; Norman & Rumelhart, 
1975; Thorndyke, 1984). Take the concept 
of “cousin” as an example. A cousin can be 
defined by a schema containing slots such as 
children, parents, and siblings along with a 
collection of relations such as parent-of and 
sibling-of: 


(cousin-of y w) =def [(parent-of x y) ∧ (sibling-of z x) ∧ (parent-of z w)]   (Eq. 16.1) 

To say that a person understands that Steve 
(slot y) and Bob (slot w) are cousins is to say 
that he or she knows that Steve (slot y) is 
the son of Carl (slot x), Carl is the brother 
of John (slot z), and John is the father of 
Bob (slot w). The slots are associated with 
ranges of appropriate values. Being a child, 
Steve must be younger than Carl; thus, slot 
y might have an age range of 1 to 50 years 
old, and slot x might have an age range of 
21 to 85 years old. Similarly, slot y can have 
the values of being either a male (a son) or a 
female (a daughter). 
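
The cousin schema of Eq. 16.1 can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the names `Slot`, `SLOTS`, `RELATIONS`, and `is_cousin_instance`, and the representation of facts as tuples, are hypothetical and are not taken from the schema literature. The age ranges for slots y and x are the ones given above; the ranges for slots z and w, and the specific ages, are invented to complete the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """A variable that accepts values within a specified range (here: age)."""
    name: str
    min_age: int
    max_age: int

    def accepts(self, age: int) -> bool:
        return self.min_age <= age <= self.max_age


# Ranges for y and x follow the text; ranges for z and w are assumptions.
SLOTS = {
    "y": Slot("y", 1, 50),
    "x": Slot("x", 21, 85),
    "z": Slot("z", 21, 85),
    "w": Slot("w", 1, 50),
}

# Relations of Eq. 16.1, each written as (relation, first slot, second slot).
RELATIONS = [
    ("parent-of", "x", "y"),
    ("sibling-of", "z", "x"),
    ("parent-of", "z", "w"),
]


def is_cousin_instance(binding, ages, facts):
    """Return True if the slot fillers satisfy every range and every relation."""
    for slot_name, person in binding.items():
        if not SLOTS[slot_name].accepts(ages[person]):
            return False
    return all((rel, binding[a], binding[b]) in facts for rel, a, b in RELATIONS)


# Known facts about the family in the running example (ages are made up).
facts = {
    ("parent-of", "Carl", "Steve"),
    ("sibling-of", "John", "Carl"),
    ("parent-of", "John", "Bob"),
}
ages = {"Steve": 12, "Carl": 40, "John": 43, "Bob": 10}
binding = {"y": "Steve", "x": "Carl", "z": "John", "w": "Bob"}
```

Calling `is_cousin_instance(binding, ages, facts)` checks both aspects of the schema at once: that each filler falls within its slot's range and that all three relations hold among the fillers.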

Schemas are bounded units of knowl- 
edge, and it is essential to their hypothe- 
sized function that they are retrieved or ac- 
tivated as units. That is, if one part of a 
schema (relation or slot) is activated, there is 
a high probability that the rest of the schema
will be activated as well. Schemas are
abstract precisely because they represent
recurring patterns in experience. Level of
abstraction can vary (Ohlsson, 1993a).

There are many variants of the schema 
idea in the cognitive literature. In the classic
chess studies of de Groot (1965) and
Chase and Simon (1973), chess experts were 
found to know by heart thousands of board 
patterns (each pattern consisting of a few 
chess pieces arranged in a meaningful con- 
figuration), and these familiar patterns al- 
tered their perception of the board to suggest 
promising moves. Similar findings regarding 
the power of perceptual patterns to influ- 
ence high-level cognition can be seen in a 
physician’s ability to read X-rays (Lesgold 
et al., 1988) and a fire fighter’s ability to 
size up a fire (Klein, 1998). Similarly, there 
is evidence that experts’ programming
knowledge includes frame-like structures
called plans (Soloway & Ehrlich, 1984),
which represent stereotypical situations that occur
frequently in programming: looping, accu- 
mulating values, and so forth. These basic 
plans not only serve as the building blocks 
when writing programs, but they are also 
necessary for comprehension of programs. 
Scripts are higher-order knowledge struc- 
tures that represent people’s knowledge of 
informal or everyday events such as eating 
in a restaurant or visiting the dentist’s office
(Schank & Abelson, 1977). Explanation pat- 
terns are schemas for how to construct ex- 
planations of particular types (Kitcher, 1993; 
Ohlsson, 2002; Ohlsson & Hemmerich, 
1999; Schank, 1986). Yet other schema-like 
constructs have been proposed (e.g., Collins 
& Ferguson, 1993; Keegan, 1989; Machamer 
& Woody, 1992). Chunks, explanation pat- 
terns, frames, plans, and scripts are vari- 
ants of the basic idea that much declarative 
knowledge consists of representations of re- 
curring patterns. For simplicity, we use the 
term schema throughout this chapter to refer 
to all these constructs. 

Although the three constructs of net- 
works, theories, and schemas appear side by 
side in the cognitive literature, the relations 
between them are unclear. First, it is not 
clear how a schema should be understood
within a network. For a schema to be a distinct
representational entity, there has to be a well-defined
boundary between the schema and the rest 
of the knowledge network. (If not, activa- 
tion will spread evenly across the nodes and 
links in the schema and the nodes and links 
that are not in the schema, which contra- 
dicts the central claim of schema theory 
that the probability of spreading from one 
node within the schema to another node 
within the schema is higher than spread- 
ing to a node outside the schema.) How- 
ever, the concept of a network does not 
provide any obvious way to explain what 
would constitute such a boundary other than 
to assume that links among nodes within a 
schema are more strongly connected than 
links among nodes between schemas (Chi & 
Ceci, 1987; Rumelhart, Smolensky, McClel- 
land, & Hinton, 1986). The differentiation 
in the strength of linkages can create clusters 
that can be conceived of as schemas (Chi & 
Koeske, 1983). 
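
The idea that a schema boundary consists of nothing more than a difference in link strength can be illustrated with a minimal sketch of spreading activation. Everything in the following Python fragment is an assumption made for illustration: the node names, the two strength values, and the simple rule that a node passes on its activation multiplied by the strength of the link. It is not a reconstruction of any published model.

```python
# Link strengths are illustrative assumptions, not empirical estimates.
WITHIN, BETWEEN = 0.8, 0.1

# Two small clusters (a restaurant script and a dentist script) joined by
# one weak link that crosses the boundary between them.
links = {
    ("menu", "waiter"): WITHIN,
    ("waiter", "bill"): WITHIN,
    ("menu", "bill"): WITHIN,
    ("drill", "dentist"): WITHIN,
    ("bill", "dentist"): BETWEEN,
}


def neighbors(node):
    """Yield (other node, link strength) for every link touching `node`."""
    for (a, b), strength in links.items():
        if a == node:
            yield b, strength
        elif b == node:
            yield a, strength


def spread(source, steps=2):
    """Spread activation from `source`, attenuated by link strength."""
    activation = {source: 1.0}
    frontier = {source: 1.0}
    for _ in range(steps):
        next_frontier = {}
        for node, act in frontier.items():
            for other, strength in neighbors(node):
                passed = act * strength
                # Keep only the strongest activation that reaches a node.
                if passed > activation.get(other, 0.0):
                    activation[other] = passed
                    next_frontier[other] = passed
        frontier = next_frontier
    return activation
```

With these assumed strengths, activating one node of the restaurant cluster activates the rest of that cluster strongly, while only a small fraction of the activation leaks across the weak link into the dentist cluster.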

The relations between a schema and a 
theory are equally unclear. One can con- 
ceptualize a schema as a tool for organizing 
information, but it is not obvious whether 
a schema makes assertions or claims about 
the world. In this conception, schemas are 
not theories, but people obviously have the- 
ories. Finally, any explication of the rela- 
tion between networks and theories must 
specify how the center-periphery structure 
that is intrinsic to theories can be embedded 
within networks. 

In this chapter, we take the stance that 
networks, theories, and schemas are three 
partially overlapping but distinct theoretical 
constructs. Different aspects of the organi- 
zation of declarative knowledge are best un- 
derstood with the help of one or the other 
of these constructs, or with some mixture of 
the three. 

In summary, declarative knowledge bases 
are very large and they exhibit complex
organization. The notion of semantic networks
captures the fact that every part of a per- 
son’s knowledge is related, directly or indi- 
rectly, to every other part. Representations 
of particular domains vary in “depth,” that
is, in the extent to which they are characterized
by a central set of fundamental ideas
or principles to which other, more periph- 
eral knowledge units are related. Declarative 
knowledge also represents recurring patterns 
in experience with schemas, small packets 
of abstract structural information that are 
retrieved as units and used to organize in- 
formation. These three types of organization 
cannot easily be reduced to each other, and 
explanations of change in complex knowl- 
edge draw upon one or the other of these 
constructs or on some mixture of the three. 


Types of Changes 


The purpose of this section is to describe 
different types of changes in the knowl- 
edge base as one learns a body of declara- 
tive knowledge. There exists no widely ac- 
cepted taxonomy of changes in a body of 
declarative knowledge. We chose to char- 
acterize changes as potentially occurring 
along seven dimensions. Presumably, differ- 
ent cognitive mechanisms are responsible for 
changes along different dimensions, but the 
field has not specified with any precision 
learning mechanisms for every dimension. 
In each section here, we specify a dimen- 
sion of change, summarize some relevant 
empirical evidence, and describe the cogni- 
tive processes and mechanisms, if any, that 
have been proposed to explain change along 
that dimension. 


Larger Size 


Cumulative growth in size is a basic di- 
mension of change in a body of declarative 
knowledge. Adults obviously know more 
about the world in general than do chil- 
dren (Chi, 1976), and thus children are of- 
ten referred to as universal novices (Brown 
& DeLoache, 1978). Similarly, experts ob- 
viously know more about their domains of 
expertise than novices (Chi, Glaser, & Farr, 
1988). People routinely accumulate addi- 
tional facts about the world from sources 
such as news programs, texts, pictures, and 
conversations. These sources present people
with facts that they did
not know before, and some of those facts
are retained. The declarative knowledge base 
continues to grow in size throughout the life 
span, albeit perhaps at a slower rate as a 
person ages (Rosenzweig, 2001). Rumelhart 
and Norman (1978) referred to this type of 
cumulative addition of pieces of knowledge 
as accretion. 

For adults, cumulative acquisition of in- 
dividual pieces of knowledge — facts — must 
be pervasive and account for a large pro- 
portion of all learning. There is little mys- 
tery as to the processes of acquisition. People 
acquire them via perception and observa- 
tion, via comprehension of oral and written 
discourse, and via inductive (see Sloman & 
Lagnado, Chap. 5) and deductive (see Evans, 
Chap. 8) reasoning (i.e., by inferring new 
facts from prior knowledge, or by integrating 
new facts with old knowledge and making 
further inferences from the combination). 

A particularly interesting property of 
accretion is that it is self-strengthening. 
Many psychology studies have confirmed 
that what is encoded, comprehended, and 
inferred depends on the individual learner’s 
prior knowledge. For example, Spilich, 
Vesonder, Chiesi, and Voss (1979) pre- 
sented a passage describing a fictitious base- 
ball game. Not only was the amount of 
recall of the individuals with high prior 
baseball knowledge greater (suggesting that 
the information was properly encoded), but 
the pattern of recall also differed. The 
high-knowledge individuals recalled more 
information directly related to the goal 
structure of the game (Spilich et al., 1979) 
as well as the actions of the game and 
the related changes in the game states 
(Voss, Vesonder, & Spilich, 1980), whereas 
the low-knowledge individuals recalled the 
teams, the weather, and other less impor- 
tant events and confused the order of the 
actions. Moreover, high-knowledge individ- 
uals were better than low-knowledge indi- 
viduals at integrating a sequence of sentences 
(Chiesi, Spilich, & Voss, 1979, exp. V). In 
short, prior knowledge leads to more effec- 
tive accretion, which in turn generates more 
prior knowledge.

Although comprehension and
inference processes augment the knowledge
base, they do not necessarily cause deep 
changes in prior knowledge. Consider once 
again a baseball fan reading a newspaper ar- 
ticle about a game. He or she will acquire 
facts that are obviously new — the score in 
the eighth inning cannot have been known 
before the game has been played — but the 
facts about past games are not altered, and he 
or she is unlikely to acquire a new and differ- 
ent conception of the game itself, although 
additional facts about baseball games per se 
may be acquired. The key characteristic that 
makes this an instance of accretion is that the 
learner already has a schema for a baseball 
game, which presumably has slots for the ba- 
sic actions (throwing the ball), the highest- 
level goal (winning the game), and other as- 
pects of the game (Soloway, 1978). Once 
that schema has been acquired, to become 
increasingly knowledgeable is largely to ac- 
quire more knowledge that fits into those 
slots, as well as knowledge of subgoals and 
relations between the basic actions and the 
goal (Means & Voss, 1985). Similarly, read- 
ers of narratives might acquire facts about 
some fictional events, but they are unlikely 
to change their conceptions of causality, 
time, or human motivation, arguably three 
central schemas in comprehending narra- 
tives (Graesser, Singer, & Trabasso, 1994; 
Kintsch, 1998). 

These observations imply that we need to 
distinguish between two levels of learning. 
Comprehension as normally understood re- 
sults in the construction of a specific instance 
of a schema or the accretion of schema- 
relevant facts. New information is assimi- 
lated to existing schemas. This is the basic 
mechanism of accretion. The size of the rel- 
evant declarative knowledge base increases 
without fundamental changes in structure. 

Deeper learning, however, results in some 
structural modification of the learner’s prior 
schema. The same distinction can easily be 
expressed within the other two theoretical 
frameworks that we use in this chapter. In 
network terms, accretion adds nodes and 
links without deleting or altering any prior 
ones, while deeper learning requires a reorganization
of the network. In terms of intuitive
theories, cumulative growth might develop
the relations between the core
principles and peripheral knowledge items, while
deeper learning either develops the core 
principles or replaces or alters one or more of 
the core principles. We discuss deeper learn- 
ing processes later in this chapter. 
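
In network terms, the distinction between accretion and deeper learning can be stated as a simple test. The Python sketch below is an illustration only: representing a knowledge base as a pair (set of nodes, set of labeled links), the function name `is_accretion`, and the baseball contents are all assumptions introduced here.

```python
def is_accretion(before, after):
    """True if `after` keeps every prior node and link and only adds new ones."""
    nodes_before, links_before = before
    nodes_after, links_after = after
    return nodes_before <= nodes_after and links_before <= links_after


# A fan's knowledge before reading about a game (contents are made up).
before = (
    {"game", "pitcher", "inning"},
    {("game", "has-part", "inning"), ("pitcher", "acts-in", "game")},
)

# Accretion: a new fact is attached to the existing structure.
accreted = (
    before[0] | {"score"},
    before[1] | {("inning", "has", "score")},
)

# Restructuring: a prior link is removed and replaced by a different one.
restructured = (
    before[0],
    (before[1] - {("pitcher", "acts-in", "game")})
    | {("game", "depends-on", "pitcher")},
)
```

Under this definition, `is_accretion(before, accreted)` holds because nothing prior is lost, whereas `is_accretion(before, restructured)` does not, because one of the earlier links has been deleted.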


Denser Connectedness 


In network terms, connectedness can be de- 
fined as the density of relations between the 
knowledge elements. We would expect the 
density of connections in a representation to 
increase as the learner acquires more knowl- 
edge. This implication was supported by a 
study in which we compared the node-link 
representation of a single child’s knowledge 
of 20 familiar dinosaurs with his represen- 
tation of 20 less familiar dinosaurs (Chi & 
Koeske, 1983; Figures 16.1 and 16.2). The 
nodes and relations of the network were cap- 
tured from the child’s generation protocols 
of dinosaurs and their attributes. The repre- 
sentation of the 20 more familiar dinosaurs 
was better connected into meaningful clus- 
ters in that it had more links relating the di- 
nosaurs that belonged to the same family, 
as well as relating the dinosaurs with their 
attributes of diet and habitat. The repre- 
sentation of the 20 less familiar dinosaurs 
had fewer links within clusters, and thus 
the clusters were less densely connected, so
they appear less differentiated and more dif- 
fused. In short, the better learned materi- 
als were more densely connected in an or- 
ganized way, even though, overall, the two 
networks represented the same number of 
nodes and links. 
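
The contrast between two networks that have the same number of nodes and links but differ in how the links are distributed can be sketched as follows. The Python fragment uses abstract, made-up node labels and family assignments; it does not reproduce the data of Chi and Koeske (1983), and the measure `within_cluster_fraction` is a hypothetical name for one simple way of quantifying the difference.

```python
def within_cluster_fraction(links, family):
    """Fraction of links that connect two nodes belonging to the same family."""
    within = sum(1 for a, b in links if family[a] == family[b])
    return within / len(links)


# Six hypothetical items grouped into two families, A and B.
family = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B", "b3": "B"}

# Well-learned material: most links stay inside a family.
familiar = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
    ("b1", "b2"), ("b2", "b3"), ("a3", "b1"),
]

# Less well-learned material: same nodes, same number of links,
# but most links cross family boundaries.
less_familiar = [
    ("a1", "a2"), ("b1", "b2"), ("a1", "b1"),
    ("a2", "b2"), ("a3", "b3"), ("a3", "b1"),
]
```

Both lists contain six links over the same six nodes, yet a larger share of the links in `familiar` fall within a family, which is what makes its clusters appear more sharply differentiated.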

Figure 16.1. A child’s representation of 20 familiar dinosaurs. (From Chi & Koeske, 1983.)

A special case of connectedness is the
mapping between layers. Layers can be defined
in different ways in different domains.
For example, in the context of computer
programming we can conceive of the specification
(the goals) as the highest layer, and
the implementation (the data structures and
primitive actions of the program) as the lowest
layer. Designing and comprehending a
program requires building a bridge between
the specification and the implementation, that is,
linking the implementation to the specification through
a series of layers. Expert programmers are 
skilled at linking high-level goals to specific 
segments of programming code, whereas 
less skilled programmers are more likely to 
link program goals to triggers like variable 
names (Pennington, 1987). Once again, we 
see that a person’s knowledge base appears 
to become more densely connected with in- 
creased knowledge acquisition. 

Another special case of connectedness 
is between the conditions (declarative 
knowledge) and the actions (procedural 
knowledge). For example, experienced and 
inexperienced pilots knew equivalent num- 
bers of facts, but the inexperienced pilots 
failed to apply them in the context of ac- 
tions (Stokes, Kemper, & Kite, 1997). One 
can interpret this to mean that the facts that 
the inexperienced pilots knew were not con- 
nected to their actions. 

Although the cited studies involved 
highly domain-specific relations, there are 
many types of connections that play cen- 
tral roles in declarative knowledge bases. 
For example, causal relations play a central 
role in the comprehension of narratives (see 
Buehner & Cheng, Chap. 7; Trabasso & van 
den Broek, 1985) and scientific theories (see 
Dunbar & Fugelsang, Chap. 29), and hierar- 
chical relations such as set-subset relations 
form the backbone of taxonomic or classi- 
ficatory knowledge structures (see Medin & 
Rips, Chap. 3). The general point is that, as 
knowledge acquisition proceeds in a domain, 
the learner’s representation of that domain 
will increase in connectedness in a meaning- 
ful way. 


Increased Consistency 


The consistency of a knowledge represen- 
tation refers to the degree to which the 
multiple assertions embedded in an intuitive 
theory can, in fact, be true at the same time. 
A person who claims that the Earth is round 
but who refuses to sail on the ocean for fear 
of falling over the edge is inconsistent in 
this sense. 


The concept of consistency has been explored
for decades in many areas of psychology,
philosophy, and education. Social
psychologists investigated the consistency of 
belief systems in the 1950s and 1960s (Abel- 
son et al., 1968; Festinger, 1962/1957; Fish- 
bein & Ajzen, 1975; Heider, 1944; McGuire, 
1968), and it remains an area of active re- 
search (Eagly & Chaiken, 1993; Harmon- 
Jones & Mills, 1999). In the wake of Thomas 
Kuhn’s influential book The Structure of Sci- 
entific Revolutions (Kuhn, 1970), the philo- 
sophical debate about theory change in sci- 
ence came to focus on how scientists react to 
inconsistencies (anomalies) between theory 
and data, and this perspective carried over 
into contemporary approaches to science ed- 
ucation (Hewson & Hewson, 1954; Posner, 
Strike, Hewson, & Gertzog, 1982; Strike & 
Posner, 1985). Education researchers were 
already primed for this focus by the tradi- 
tional concern in the Piagetian tradition with 
contradictions and inconsistencies as driv- 
ing forces for cognitive development (Piaget, 
1985). Unfortunately, the social, philosoph- 
ical, educational, and developmental liter- 
atures on cognitive consistency are not as 
tightly integrated as they ought to be in light 
of the nearly identical ideas that drive re- 
search in these fields. 

It is reasonably certain that people pre- 
fer consistent over inconsistent beliefs, at 
least locally, and that the discovery of lo- 
cal inconsistency (or conflict; Ames & Mur- 
ray, 1982) triggers cognitive processes that 
aim to restore consistency, just as Piaget, 
Festinger, Kuhn, and others have hypothe- 
sized. For example, Thagard (1989, 2000) 
explored a computational network model 
called ECHO in which consistency is de- 
fined as the lack of contradictions between 
assertions and hypotheses. ECHO has suc- 
cessfully predicted human data from a va- 
riety of situations, including the evaluation 
of scientific theories in light of data (Tha- 
gard, 1992a) and the outcome of court cases 
(Thagard, 1992b). 

Figure 16.2. A child’s representation of 20 less familiar dinosaurs. (From Chi & Koeske, 1983.)

However, the relation between experienced
inconsistency and cognitive change
is complex. Several investigators suggested
that conflict triggers efforts to restore
consistency only when the conflict is recognized
by the learner him- or herself through
reflection (Chi, 2000; Ohlsson, 1999; Strike 
& Posner, 1992). When learners are alerted 
to inconsistencies and conflicts by an ex- 
ternal source, they are more likely to ei- 
ther assimilate or dismiss them (Chinn & 
Brewer, 1993). Contradiction highlighted by 
an external source is likely to trigger change 
processes only if the learner is dissatisfied 
with his or her current conception (Posner 
et al., 1982). Furthermore, there are many 
ways to respond to inconsistency (Chinn 
& Brewer, 1993; Darden, 1992; Kelman & 
Baron, 1968), and not all modes of response 
increase consistency (as opposed to bypass- 
ing the problem); we return to this topic 
in “The Learning Paradox: Monotonic and 
Nonmonotonic Change.” 

Consistency should not be confused with 
veridicality. It is possible for a knowledge 
representation to be locally consistent and 
yet be inaccurate. For example, we have 


argued that the naive conception of the 
circulatory system as a single-loop system 
is flawed but nevertheless constrained by 
a consistent set of identifiable yet inaccu- 
rate principles. The learner can use such a 
flawed conception systematically to generate
incorrect explanations (Chi, 2000). Historically,
the Ptolemaic epicycle theory of
the solar system was as internally consistent 
as the Keplerian theory, but obviously not 
as accurate. 

Consistency should also not be confused 
with level of expertise. A more knowl- 
edgeable person does not necessarily have 
a more consistent domain representation 
than someone who knows less. Ability to 
operate with inconsistency has often been 
proposed as a sign of intellectual sophistica- 
tion, whereas insistence on total consistency 
has long been associated with dogmatism 
and lack of intellectual flexibility (Ehrlich 
& Leed, 1969; Rokeach, 1960). A famous 
historical example is the resolution (or lack
thereof) of the conflict
between the wave and particle models of
photons. These annoying entities insist on 
behaving as both waves and particles, and 
since the time of Niels Bohr physicists have 
been content to let them be that way. 

Consistency is sometimes used synony- 
mously with the term coherence, as in Tha- 
gard’s (1992a) use of the term explanatory co- 
herence to refer to the consistency between a 
hypothesis and evidence and other hypothe- 
ses. However, consistency is distinct from 
coherence in that, as a measure of a repre- 
sentation, coherence can be used to refer to 
the more well-defined connectedness in a se- 
mantic representation in which the notion of 
contradiction or conflict is not an issue (Chi 
& Koeske, 1983). There is not enough ev- 
idence or agreement about the concept of 
coherence to warrant discussing it as a sepa- 
rate dimension of change. 

In summary, increased consistency is an 
important type of change in a declarative 
knowledge base, but it is distinct from the 
concepts of higher veridicality, more ad- 
vanced knowledge, and coherence. 


Finer Grain of Representation 


Reality is not simple, and almost any aspect 
of it can be described or represented at dif- 
ferent levels of grain. As one learns more 
about something, one often comes to under- 
stand it at a finer grain. For example, learn- 
ing how the human circulatory system works 
involves learning the components of the sys- 
tem, such as the heart, the lungs, blood, and 
blood vessels, and the relation that the con- 
traction of the heart sends blood to different 
parts of the body. 

Given this level of representation, one 
can then ask, how does the heart contract? 
To answer this question, one would have to 
learn about the constituents of the heart: 
the properties of contractive muscle fibers, 
the role of ventricle pressure, and so on. 
The learner might push yet toward another 
level by asking how individual muscle fibers 
contract. At each level the system is understood
in terms of its constituent parts,
and a finer grain is reached by unpacking
each component into its constituent parts.
This type of process expands the knowl- 
edge base, but in a particular way: It moves 
along part-of links (as opposed to kind-of 
links). In network terms, what was formerly 
a single node is expanded downward into an 
entire subtree. 
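
The expansion of a single node into a subtree along part-of links can be sketched in a few lines. In the Python fragment below, the dictionary representation and the function names `expand` and `leaves` are assumptions made for illustration, and the two constituents listed for the heart are a simplified stand-in for the finer-grained knowledge described above.

```python
def expand(parts, node, constituents):
    """Return a new part-of map in which `node` is unpacked into a subtree."""
    refined = {whole: list(children) for whole, children in parts.items()}
    refined[node] = list(constituents)
    return refined


def leaves(parts, root):
    """List the finest-grained components reachable from `root`."""
    children = parts.get(root, [])
    if not children:
        return [root]
    found = []
    for child in children:
        found.extend(leaves(parts, child))
    return found


# Coarse representation: the system and its immediate components.
coarse = {"circulatory system": ["heart", "lungs", "blood", "blood vessels"]}

# Finer representation: the heart node is expanded downward.
finer = expand(coarse, "heart", ["muscle fibers", "ventricles"])
```

In `coarse` the heart is an unanalyzed leaf, whereas in `finer` it has become an internal node whose own parts are now the leaves. The rest of the representation is unchanged.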

Miyake (1986) collected protocol data 
that illustrated this type of change. She 
showed that dyads, in attempting to under- 
stand how a sewing machine works, would 
move to lower and lower levels when they 
recognized that they had not understood 
the mechanism. For example, in figuring out 
how a stitch is made, one can understand it 
by explaining that the needle pushes a loop 
of the upper thread through the material 
to the underside so the upper thread loops 
entirely around the lower thread. However, 
to understand how this looping mechanism 
works, one has to explain the mechanism at 
a yet finer level — namely, in terms of how 
the bottom thread goes through the loop of 
the upper thread. 

Knowledge expansion via finer grain of 
representation is quite common in the sci- 
ences. The ultimate example is perhaps the 
reduction by chemists of material substances 
to molecules, described in terms of atoms, 
which in turn are re-represented by physi- 
cists in terms of elementary particles. We 
should keep in mind though that it is the ex- 
perts’ representations of these domains that 
are refined, and novices’ representations do 
not necessarily follow suit. 

In analyzing biological systems such as the 
circulatory system and machines such as 
the sewing machine, the parts are objects of 
the same kind as the system itself so they 
embody the part-of relations. In these exam- 
ples, valves and veins are of the same kind 
and are both parts of the cardiovascular sys- 
tem, and thread and a stitch are of the same 
kind and are both parts of the sewing process. 
The link between the behavior of the parts 
and the behavior of the whole can often be 
understood in terms of direct cause and ef- 
fect, or in terms of mechanical constraints 
that force movement in one direction
rather than another, as with the valves in
the veins.

However, there are systems in which the 
relation between the finer and coarser levels 
of analysis is not of the same kind and the 
behavior of the system is emergent (Chi, in 
press; Wilensky & Resnick, 1999). A traffic 
jam is an example. A traffic jam is a grid- 
lock of cars such that cars can no longer 
move at normal speed. However, the cars 
are not of the same kind as the traffic jam. 
In this kind of system, the (often) observ- 
able macrolevel behavior (the traffic jam) 
can be represented independently of the mi- 
crolevel objects (the moving cars). Each in- 
dividual car may be following the same sim- 
ple rule, which is to accelerate if there is no 
car in front within a certain distance and to 
slow down when there is another car within 
that distance. However, the jam itself can 
move backward even though the individual 
cars move forward. Thus, the behavior of the 
individual cars in a jam is independent of 
the jam. Nevertheless, the macrolevel pat- 
tern (the jam) arises from local interactions 
among the microlevel individual cars. 
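
The backward movement of a jam made up of forward-moving cars can be demonstrated with a very small simulation. The Python sketch below simplifies the rule described above to its bare minimum, as an assumption for illustration: the road is a ring of cells, and at each time step a car advances one cell if the cell ahead is empty and otherwise waits. The function names, the road length, and the starting positions are all hypothetical.

```python
def step(road):
    """Advance every car one cell if the cell ahead is empty (synchronous update)."""
    n = len(road)
    new = [False] * n
    for i, occupied in enumerate(road):
        if occupied:
            ahead = (i + 1) % n
            if road[ahead]:
                new[i] = True       # blocked: the car waits
            else:
                new[ahead] = True   # free: the car moves forward
    return new


def jam_span(road):
    """Return (first, last) cell of the longest run of adjacent cars."""
    best = (0, -1)  # an empty span, so any real run is longer
    start = None
    for i, occupied in enumerate(road + [False]):
        if occupied and start is None:
            start = i
        elif not occupied and start is not None:
            if i - start > best[1] - best[0] + 1:
                best = (start, i - 1)
            start = None
    return best


# Five cars queued in cells 10 to 14, with four more cars approaching.
road = [False] * 20
for cell in (2, 4, 6, 8, 10, 11, 12, 13, 14):
    road[cell] = True

spans = []
for _ in range(5):
    spans.append(jam_span(road))
    road = step(road)
```

No car ever moves backward in this simulation, yet the recorded spans show the jam shifting one cell backward at every step: the car at the front escapes while an arriving car joins at the rear.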

Learning about systems of this kind does 
not necessarily proceed by unpacking parts 
into yet smaller parts but might more of- 
ten occur by acquiring the two represen- 
tations of the system separately and then 
linking them. This type of learning pro- 
cess re-represents the macro in terms of 
the relationship between the micro- and 
the macrolevels to explain the macrolevel 
phenomenon (Chi, in press; Chi & Haus- 
mann, 2003). 

It is not clear how often people are driven 
to expand their representations downward 
to a finer grain of analysis. In everyday life,
people do not always feel the necessity to 
connect phenomena at one level to phenom- 
ena at more fine-grained levels. For exam- 
ple, people appear content to understand 
the weather at the level of wind, tempera- 
ture, clouds, humidity, rain, and snow, with- 
out re-representing them at the finer lev- 
els of molecular phenomena available to the 
professional meteorologist (Wilson & Keil, 
2000). We do not yet understand the factors
and processes that drive people to expand,
but the possibility of such expansion is one
important dimension of change in declara- 
tive knowledge. 


Greater Complexity 


A distinct type of change in the knowledge 
structure is needed when the learner’s cur- 
rent concepts are not sufficient to represent 
the phenomenon or system as a whole. The 
thing to be understood cannot be assimilated 
within any schema the learner has available. 
The learner can respond by creating a more 
complex schema (see Halford, Chap. 22). 
Although little is known about how more 
complex schemas are developed, one plau- 
sible hypothesis is that they are created 
by combining or assembling several exist- 
ing schemas (Ohlsson & Hemmerich, 1999; 
Ohlsson & Lehtinen, 1997). 

The creation of the theory of evolution 
by natural selection is a case in point. In the 
nineteenth century, many biologists knew 
that there were variations within species and 
that many species produce more offspring 
than survive into adult (reproductive) age, 
and the fact (as opposed to the explanation) 
of inheritance was of course commonly ac- 
cepted. The theory of evolution is the re- 
sult of assembling or combining these three 
schemas in a very particular way into a new, 
more complex schema. The change process 
here does not move along either kind-of or 
part-of relations, and it does not refine the 
grain of representation. Instead, it moves to 
greater complexity. The resulting schema is 
more complex than any of the prerequisite
schemas. Such a move does not necessarily
require a higher level of abstraction
(see the next section). The prior principles of 
intraspecies variation, inheritance, and dif- 
ferential survival were already abstract, and 
there is no significant increase in abstraction 
in the theory that combines them. 

The assembly process can be prompted. 
In one study, Ohlsson and Regan (2001) 
studied a laboratory version of the problem 
of the structure of DNA. Based on published 
historic accounts of the discovery of DNA, 
we extracted the eight concepts
that had to be combined to represent
the double-helix structure. These turned out 
to be concepts that most educated adults can 
be expected to possess (e.g., parallel, pairwise,
inverse, complement). We found a linear
relationship between the proportion of
these eight concepts that were primed by 
exercises prior to problem solving and the 
time it took undergraduate students to solve 
the laboratory version of the DNA problem. 

The assembly process can be understood 
as a combination of schemas. The key step 
in combining schemas must be to align the 
slots of one schema to those of another. 
A natural selection schema does not work 
unless the species that exhibits variation is 
also the species that is subject to selec- 
tive pressure. The assembly process might 
share features with conceptual combination, 
although the latter process refers to sin- 
gle lexical concepts consisting of unfamil- 
iar noun-noun or adjective-noun pairs, such 
as pet fish (Costello & Keane, 2000; Hamp- 
ton, 1997; Medin & Shoben, 1988; Smith, 
Osherson, Rips, & Keane, 1988; see Medin 
& Rips, Chap. 3). We know little about the 
frequency and prevalence of moves toward 
creating greater complexity at either the sin- 
gle concept or schema levels, and less about 
the conditions that prompt people to engage 
in such moves. 
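
The slot-alignment step in combining schemas can be sketched as follows. This Python fragment is a deliberately minimal illustration under stated assumptions: a schema is reduced to a mapping from slot names to fillers, `combine` is a hypothetical function name, and the fillers are invented examples. It captures only the constraint that shared slots must be bound to the same filler, not the full assembly process.

```python
def combine(*schemas):
    """Merge schemas whose shared slots hold the same filler; else return None."""
    merged = {}
    for schema in schemas:
        for slot, filler in schema.items():
            if merged.setdefault(slot, filler) != filler:
                return None  # a shared slot is bound to different fillers
    return merged


# Three prerequisite schemas with invented fillers.
variation = {"species": "finches", "trait": "beak size"}
inheritance = {"species": "finches", "trait": "beak size"}
selection = {"species": "finches", "pressure": "food scarcity"}

# The same selection schema bound to a different species.
misaligned_selection = {"species": "tortoises", "pressure": "food scarcity"}
```

Here `combine(variation, inheritance, selection)` yields a single larger schema, whereas substituting `misaligned_selection` makes the combination fail, mirroring the point that the species exhibiting variation must also be the species subject to selective pressure.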


Higher Level of Abstraction 


The concept of abstraction, in terms of 
where it comes from or how it is derived, 
continues to be controversial after two mil- 
lennia of scholarship. Besides the issue of 
how abstractions are formed, there is a 
second, frequently overlooked meaning of 
moving toward higher abstraction: Given a 
preexisting set of abstractions, it is possible 
to re-represent an object or a domain at a 
higher level of abstraction. For example, Chi, 
Feltovich, and Glaser (1981) showed that 
physicists represented routine physics prob- 
lems in terms of the deep principles that 
would be needed to construct a solution, 
whereas physics novices (those who have 
taken one course in college with an A grade) 
tended to represent the same problems according
to their concrete surface components,
such as pulleys and inclined planes.
The point is that one and the same prob- 
lem tends to be represented at these dif- 
ferent levels of abstraction by two groups 
both of whom know the relevant principles. The 
novices in the Chi et al. (1981) study knew 
the relevant principles in the sense that they 
could both state them and use them. How- 
ever, they did not spontaneously represent 
problems in terms of those principles instead 
of concrete properties. Somewhere along the 
path to expertise, the physicists came to do 
so (see Novick & Bassok, Chap. 14). 

Re-representing at a higher level of ab- 
straction (using already acquired abstrac- 
tions) is an interesting dimension of change, 
but relevant empirical studies are scarce. As 
is the case with most other types of changes, 
we lack knowledge of the conditions that 
prompt people to move along this dimension 
and the exact nature of the relevant cogni- 
tive mechanism. 


Shifted Vantage Point 


Changing the level of abstraction is closely 
related to, but different from, the process 
that we in normal parlance call change of 
perspective. A classic study by Anderson and 
Pichert (1978) demonstrates that this phrase 
does not merely refer to a metaphor but to 
a concrete psychological process. They gave 
subjects a text to read that described a home. 
They instructed subjects to take the perspec- 
tive of either a burglar or a prospective home 
buyer. The results showed that the instruc- 
tions led the subjects to remember different 
details, even when the perspective-taking in- 
structions were given after the subjects had 
read the text. 

Shifting one’s point of view can facilitate 
problem solving. For example, Hutchins and 
Levin (1981) used the occurrence of deictic 
verbs, such as “come,” “go,” “take,” “send,” 
and “bring,” and place adverbs, such as 
“here,” “there,” and “across,” in think-aloud 
protocols to determine the point of view of 
subjects solving the Missionaries and 
Cannibals problem. They found that problem 
solvers shift their point of view as they work 
on the problem. Initially, they view the river that 
the Missionaries and Cannibals have to cross 
from the left bank. Later in the problem- 
solving process, they view the river from 
the right bank. One of their most interest- 
ing findings was that when solvers were in 
an impasse after having two nonprogressive 
moves out of their current problem-solving 
state, they could resolve the impasse if they 
shifted their point of view. In short, the 
somewhat mysterious process of “taking” a 
particular perspective should not be understood 
as purely metaphorical; this form of 
re-representation has real consequences for 
cognitive processing. 

In the cases discussed, the perspective 
shift was transient. There is some evidence 
to suggest that children become more able to 
shift perspective as they grow older (see Hal- 
ford, Chap. 22). For example, Shatz and Gel- 
man (1973) showed that young 2-year-olds 
could not adjust their speech to the age of 
the listener, whereas 4-year-olds did adjust 
their speech, depending on whether they 
were speaking to another peer or an adult. 
This suggests that older (but not younger) 
children are capable of shifting their perspective 
to that of the listener. Similarly, 
Piaget and Inhelder (1956) showed that 
older but not younger children are capable 
of understanding what another viewer might 
see, when the other person views it from an- 
other perspective. One might assume that as 
children mature they acquire more knowledge 
that enables them to shift perspective, 
and one study that manipulated knowledge 
directly supports this interpretation. We gave 
high school students opportunities 
to play with a computer simulation 
that allows them to take different roles in a 
business context, such as being the vice pres- 
ident of a bank. Students were much more 
able to take the perspective of the client after 
playing with the simulation, whereas they 
were only able to take the perspective of 
the bank before playing with the simulation 
(Jeong, Taylor, & Chi, 2000). 

In another series of studies, we attempted 
to teach first-grade children about the shape 
of the Earth (Johnson et al., 1999, 2001; 
Ohlsson, Moher, & Johnson, 2000). Deep 
understanding of this topic 
requires that a person can coordinate the 
normal — we call it ego-centered — perspec- 
tive of a person walking around on the Earth 
with an exo-centered perspective from a hy- 
pothetical (and physically unattainable) van- 
tage point in space. Such perspective coordi- 
nations can be very complex. For example, 
consider sunsets. What in the ego-centered 
perspective appears as the sun disappear- 
ing behind the horizon appears in the exo- 
centered perspective as movement of the 
border between light and shadow across the 
surface of the Earth owing to the latter’s ro- 
tation. Clearly, the mapping between these 
two views of the event is far from natural, 
simple, or direct, and it requires consider- 
able learning and instruction to develop the 
exo-centered perspective and to link it to 
everyday perception. 

These and related studies demonstrate 
the occurrence of shifting vantage points and 
document the advantages they bring. This 
type of change must be an important dimen- 
sion of growth of declarative knowledge. 


Discussion 


We suggest that a complex body of declar- 
ative knowledge over time moves along 
multiple dimensions of change: size, con- 
nectedness, consistency, grain, complexity, 
abstraction, and vantage point. Undoubt- 
edly, there are other dimensions along which 
declarative knowledge also changes dur- 
ing learning, such as coherence, but each 
of these has at least some support in 
empirical studies. 

Although we separate these seven di- 
mensions analytically for purposes of this 
chapter, we do not suggest that a cognitive 
change typically moves along a single dimen- 
sion. Most complex knowledge acquisition 
processes will involve simultaneous move- 
ment along more than one dimension. For 
example, learning about chemistry involves 
thinking of material substances as solids, liq- 
uids, and gases, instead of, for example, iron, 
water, and air; this is a move toward higher 
abstraction. At the same time, the 
student acquires a finer-grained analysis of 
material substances in terms of atoms and 
molecules and a large number of previously 
unknown isolated facts about such sub- 
stances (e.g., their melting points). He or she 
might have to assemble a new schema such 
as dynamic equilibrium, which involves shift- 
ing the vantage point between the atomic 
level (where there are continuous processes) 
and the emergent macrolevel (where there 
is, nevertheless, stability). A year of high 
school chemistry is likely to require move- 
ment along all seven of these dimensions. We 
suggest that this is typical in the acquisition 
of complex declarative knowledge. 

Given that a representation can change 
in all the ways we have described previ- 
ously, research on the acquisition of complex 
declarative knowledge encounters a partic- 
ular difficulty — how to assess the effects 
of different learning scenarios and training 
procedures. The study of declarative knowl- 
edge contrasts in this respect with the study 
of procedural knowledge. Learning of pro- 
cedural knowledge such as problem solv- 
ing can be assessed relatively straightfor- 
wardly by measuring the degree to which 
a learner’s representation of the procedure 
approximates the correct solution procedure 
in terms of the rules and strategies. Learning 
of declarative knowledge, however, must be 
measured in light of the seven dimensions 
mentioned previously. This is perhaps the 
most important methodological problem in 
the study of complex declarative knowledge. 

Although we understand the character 
of these seven dimensions relatively well, 
we know little about what triggers people 
to move along one or the other dimension. 
What are the factors that trigger someone 
to move to a finer grain or to another level 
of abstraction? Under what conditions will a 
learner move to an alternative vantage point? 
Similarly, we do not fully understand the na- 
ture of the processes that bring about the 
changes in each dimension. Empirical re- 
search has been focused on documenting the 
psychological reality of each type of change 
and has not sufficiently pursued the questions 
of the triggering conditions and processes of change. 

The seven types of changes discussed so 
far expand the learner’s prior knowledge 
base in a monotonic way in that the prior 
knowledge need not be rejected or over- 
written. It is possible to move toward larger 
size, denser connectedness, finer grain of 
representation, greater complexity, higher 
abstraction, and a different vantage point 
without rejecting or replacing one’s prior 
knowledge representation. The one excep- 
tion is a move toward increased consistency. 
To achieve increased consistency, one might 
have to reject or abandon some prior knowl- 
edge or belief. The next section discusses 
such nonmonotonic changes. 


The Learning Paradox: Monotonic 
and Nonmonotonic Change 


It is tempting to think of a novice as primar- 
ily lacking knowledge. The learning process 
is then naturally seen as a process of accre- 
tion — filling a void or adding information. 
Some of the types of changes described in 
the previous sections, such as increased con- 
nectedness and moves toward finer grain of 
representation, also have this cumulative na- 
ture because they significantly extend prior 
knowledge. However, several of the other 
types of changes, such as greater complex- 
ity, higher level of abstraction, and shifting 
vantage point, do not have this cumulative 
nature. Rather, they go further in that they 
re-represent the domain rather than merely 
add to it. However, in either the cumula- 
tive cases or the re-representation cases, the 
changes do not require that prior knowl- 
edge be rejected or replaced. For exam- 
ple, re-representing something at a higher 
level of abstraction does not require rejec- 
tion of the prior representation because ab- 
stract and concrete representations of the 
same thing are not mutually incompatible. 
We can switch back and forth between con- 
ceptualizing something as a hammer and as a 
tool without any need to make a permanent 
choice between them. In 
these types of re-representation, the 
old and the new representation can coexist. 
The same holds when two component 
concepts or schemas are re-represented as a 
more complex concept or schema via assembly: 
the representations for the original concepts 
remain. In short, these types of cumulative and 
re-representational changes are monotonic. 

However, there are learning scenarios in 
which (1) the learner has a well-developed 
intuitive theory of the target domain, and 
(2) the subject matter to be acquired di- 
rectly contradicts one or more of the core 
principles or beliefs of that intuitive theory. 
Successful learning in scenarios with these 
properties requires that the learner go be- 
yond mutually compatible representations. 
The learner has to re-represent the domain 
in the more fundamental sense of abandon- 
ing or rejecting (i.e., stop believing) what he 
or she believed before, and replacing it with 
something else. We refer to this as nonmono- 
tonic change. 

Science education provides numerous ex- 
amples of prior conceptions that must be 
abandoned. Research on so-called miscon- 
ceptions has documented that people have 
complex and rich conceptions about do- 
mains in which they have not received ex- 
plicit instruction, but for which everyday 
experience provides raw material for intu- 
itive theory formation (Confrey, 1990). Re- 
search on such spontaneous science theories 
has focused on physics, chemistry, and biol- 
ogy, although social science and nonscience 
domains have also been investigated (Limon, 
2002). (The older social psychology work 
on belief systems focused primarily on in- 
tuitive theories of society and religion; see, 
e.g., Abelson et al., 1968; Rokeach, 1970.) 

Mechanics (forces and motion) is by 
far the most investigated domain. The 
dominant misconception in this domain is 
that motion implies force (Clement, 1982; 
DiSessa, 1983, 1988; Halloun & Hestenes, 
1985; McCloskey, 1983; Minstrell, 1982). 
Students assume that when an object is in 
motion, the motion is caused by a force being 
applied to the object, the object’s motion 
is in the direction of the force, an object 
will move with constant velocity as long 
as it is under the influence of a constant 
force, and the velocity of an object is pro- 
portional to the magnitude of the applied 
force. When there is no force, an object will 
either slow down, if it is moving, or remain 
at rest. Motion is thus misconceived as being 
produced by force, as opposed to the more 
accurate view that motion is a natural (i.e., 
equilibrium) state that will continue indef- 
initely unless some force interferes with it. 
Students’ intuitive theory is more like the 
impetus theory held by Jean Buridan and 
other fourteenth-century thinkers (Robin & 
Ohlsson, 1989) than like the inertia principle 
that is central to the Newtonian theory. Mis- 
conceptions about other topics, such as bio- 
logical evolution, are also well documented 
(Bishop & Anderson, 1990; Brumby, 1984; 
Demastes, Settlage, & Good, 1995; Ferrari 
& Chi, 1998; Lawson & Thompson, 1988). 
The empirical findings show not only that 
novices possess well-developed misconceptions 
about many domains (Reiner, Slotta, 
Chi, & Resnick, 2000) but also that these 
misconceptions persist in the face of instruction 
and other innovative kinds of intervention. 
For example, many science misconceptions 
in Newtonian mechanics are robust and re- 
main after instruction, even at very selec- 
tive academic institutions (DiSessa, 1982; 
Caramazza, McCloskey, & Green, 1980). 
With respect to mechanics, innovative in- 
structional interventions include using care- 
fully chosen analogies (Clement, Brown, & 
Zietsman, 1989; Driver, 1987), deliberately 
invoking cognitive conflict (Posner et al., 
1982), engaging in deliberate confrontation 
(Licht, 1987), or using a succession of in- 
creasingly sophisticated models (White & 
Frederiksen, 1990). Although it is difficult to 
evaluate the outcomes of such interventions, 
it appears that students at best acquire the 
scientific conception, perhaps in an encapsu- 
lated form, while maintaining their initial in- 
tuitive conception (Johsua & Dupin, 1987), 
which is not quite the intended outcome. 
There are at least three reasons (presented in 
the next section) why misconceptions are so 
robust that nonmonotonic change often fails. 


Distortion via Assimilation 


As was mentioned earlier, in learning, new 
information is typically assimilated to exist- 
ing schemas. Thus, one reason that miscon- 
ceptions persist is that, when an instructor 
states the more veridical theory so it contra- 
dicts the learner’s prior misconceived knowl- 
edge, the new information is typically dis- 
torted in the process of being assimilated 
to the prior misconceived knowledge. To 
illustrate, consider a young child who be- 
lieves that the Earth is as flat as it looks 
to the unaided eye. What happens if he or 
she is told that the Earth is round? Nuss- 
baum (1979; 1985), Nussbaum and Novak 
(1976), Vosniadou (1994a, 1994b), and Vos- 
niadou and Brewer (1992) observed two 
intuitive schemas that we are tempted to 
interpret as consequences of distortion by 
assimilation. Some children draw the Earth 
as a flat entity with a circular periphery (like 
a pancake); others claim that the Earth is 
spherical but hollow and half-filled with dirt 
(thus providing a flat surface for people to 
walk on). In both cases, the Earth is both 
flat and round. Instruction to the effect that 
the Earth is round was thus assimilated to a 
prior flat-Earth conception without any sig- 
nificant changes in the latter. 


Evasion of Conflicts 


Distortion via assimilation is most plausible 
when the learner is unaware of the conflict 
between his or her prior knowledge and new 
information. The previous example involv- 
ing the shape of the Earth illustrates this 
well; the young child is not aware that he 
or she is interpreting the adjective “round” 
in a different way than that intended by the 
adult speaker. This type of distortion can be 
reliably triggered in the laboratory by deliberately 
creating texts that violate a normal 
reader’s worldview (Graesser, Kassler, 
Kreuz, & McLain-Allen, 1998). 

However, even if the conflict between 
prior knowledge and new information is detected, 
it does not necessarily trigger productive 
change processes. Social psychologists 
(Abelson et al., 1968) and cognitive 
researchers (Chinn & Brewer, 1993; Darden, 
1992) have converged on very similar lists 
of potential modes of response to inconsis- 
tency. They agree that inconsistency often 
triggers evasive maneuvers that dismiss the 
inconsistency in some other way than by 
revising the relevant knowledge. The most 
basic mode of response is abeyance, that is, 
to postpone dealing with a contradiction on 
the grounds that not enough information is 
available to decide what, if anything, fol- 
lows. One step removed from doing nothing 
is bolstering: The person who encounters information 
that contradicts some concept or 
belief X hastens to seek out 
evidence that supports X. Festinger 
(1962/1957) and others hypothesized 
that the need to reduce an inconsistency is 
proportional to the ratio of contradicting to 
supporting pieces of information. Thus, 
by drowning the contradicting piece of in- 
formation in a flood of confirming ones, it 
is possible to lower the need to resolve the 
contradiction and hence to keep going with- 
out altering one’s knowledge. Another pro- 
cess with a similar outcome is recalibration, 
that is, to lower the importance one attaches 
to the conflicting thoughts, thus making the 
conflict itself less important and easier to ig- 
nore. (A student might decide that he or 
she is not interested in science after all, so it 
does not matter what they teach in science 
courses.) These processes constitute evasive 
modes of response to inconsistent informa- 
tion, but they are not learning processes be- 
cause there is no constructive change in the 
person’s knowledge. 


Lack of Computational Power 


In describing the seven dimensions of 
changes, we sometimes speculated on the 
processes of change. What would happen if 
the inconsistent information triggered one or 
more of the learning processes that we proposed 
in previous sections? Take the process 
of creating greater complexity via assembly 
as an example. In that process, a more complex 
representation is created by combining two 
or more existing representations. It is doubtful 
whether this process could lead to a new, 
more veridical theory. Each of the assembled 
representations will presumably be consis- 
tent with the learner’s prior intuitive the- 
ory, so they will lack veridicality. One cannot 
combine two nonveridical representations to 
create a third, veridical representation. For 
example, learners’ naive conceptions of heat 
and temperature, when combined, do not 
add up to the correct scientific conception 
of heat (Wiser & Carey, 1983), nor can tele- 
ological and Lamarckian ideas combine to 
form the principle of natural selection. 

Although we do not spell out each argu- 
ment here, a similar case could be made re- 
garding the processes responsible for each of 
the seven types of changes discussed in the 
previous section. None of them has the computational 
power to create a new conception 
that goes beyond its own conceptual inputs 
because, by definition, they are monotonic 
changes. 

In summary, the mere presence of contra- 
dictory information is not sufficient to trig- 
ger productive cognitive change of the non- 
monotonic kind. A conflict between prior 
knowledge and new information might go 
undetected, in which case the learner might 
blithely assimilate the new information to 
prior knowledge, probably distorting it in the 
process. Even if the learner detects the con- 
flict, he or she might hold the new infor- 
mation in abeyance rather than respond to 
it. If he or she feels a need to deal with the 
contradiction, there is a repertoire of evasive 
maneuvers, including bolstering and recali- 
bration of subjective importance, that will 
make the contradiction less disturbing with- 
out any revisions in prior knowledge. Finally, 
the productive learning processes discussed 
previously do not have the computational 
power to create a new conception that goes 
beyond the conceptual inputs to those pro- 
cesses. The prevalence of these three kinds 
of responses to encounters with contradic- 
tory information — distortion via assimila- 
tion, evading conflicts, and lacking computa- 
tional power — raises the question of how an 
intuitive theory can ever be replaced. That 
is, how can a truly new theory or idea that is 
not an extension of old theories or ideas 
be acquired? Bereiter (1985) referred to this 
as the learning paradox. 


Conclusions and Future Directions 


Despite the prevalence of distortion via as- 
similation to prior knowledge, evasion of 
conflicts, and lack of computational power, 
nonmonotonic change does happen. 

Children do replace their childhood con- 
ceptions with adult ones, some physics stu- 
dents do succeed in learning Newtonian me- 
chanics, and scientists do sometimes replace 
even their most fundamental theories in the 
face of anomalous data. Thus, there must be 
cognitive mechanisms and processes that can 
overcome the learning paradox. A theory of 
complex learning should explain both why 
nonmonotonic change has such a low prob- 
ability of occurring, and how, by what pro- 
cesses, it happens when it does happen. 

The study of such noncumulative learn- 
ing processes is as yet in its infancy. In this 
section, we offer a small number of specu- 
lative proposals about how nonmonotonic 
learning processes can occur. These brief 
proposals are intended to serve as inspiration 
for further research. 


Pathways to Nonmonotonic Change? 


We describe below four mechanisms, along 
with some empirical support. We then con- 
sider whether each of them can potentially 
achieve nonmonotonic change. 


TRANSFORMATION VIA BOOTSTRAPPING 


One hypothetical path to a new theory is to 
edit or revise one’s existing theory piece by 
piece until the theory says something signif- 
icantly different from what it said originally. 
We can conceptualize such a bootstrapping 
process as a series of local repairs of a knowl- 
edge structure. Local repairs require simple 
mechanisms such as adding links, deleting 
links, reattaching links, and so forth. The 
critical condition for local repairs is that the 
student recognize that the repairs are needed 
by reflecting on the differences between his 
or her existing knowledge and new knowledge. 
We have some evidence that the accumulation 
of local repairs can lead to a significant 
transformation of a person’s mental 
model of the circulatory system from 
a flawed single-loop model to the correct 
double-loop model (Chi, 2000). 

As a second example of bootstrapping, 
Thagard (1992a) analyzed the changes in 
the French chemist Lavoisier’s conception of 
matter during the critical years of the development 
of the oxygen theory of combustion. 
Thagard shows how Lavoisier’s conception 
of combustion can be modeled by a seman- 
tic network, and how that network is grad- 
ually transformed over several years as the 
scientist is reflecting on the outcomes of em- 
pirical experiments. By adding and deleting 
nodes and redrawing links, Thagard depicts 
Lavoisier’s knowledge network as undergo- 
ing a gradual transformation such that its ini- 
tial state represents the phlogiston theory of 
combustion, but its final state represents the 
oxygen theory. 

How much can transformation via local 
repairs explain? There are multiple expla- 
nations for why local repairs succeed in the 
case of the circulatory system. One reason 
is that the transformation from a single-loop 
model to a double-loop model crosses no on- 
tological categories (Chi & Roscoe, 2002). 
Another reason might be the relative lack 
of “depth” of this domain in the sense that it 
cannot be represented by a center-periphery 
structure. The single-loop principle does not 
deductively imply the other relevant facts 
about the circulatory system in the manner 
in which Newton’s three laws of motion im- 
ply more peripheral statements within the 
domain of motion. The looser connection 
between center and periphery might make 
the single-loop principle easy to tinker with. 
Finally, there is a question of commitment 
(Ohlsson, 1999). Although students believe 
that there is a single circulatory loop, this is 
not one of their most cherished beliefs and 
they probably do not experience it as im- 
portant to their worldview. Tinkering even 
with the core principle of this domain might 
therefore come easier than in domains with 


deeper commitment to the core principles. 
Rokeach (1970) presented evidence from 
domains other than science that knowledge 
elements are more resistant to change 
the more central they are. It is plausible that 
transformation via bootstrapping a sequence 
of local repairs is less applicable the “deeper” 
the domain, at least as long as the change has 
to encompass the core principles to be com- 
plete. So perhaps this bootstrapping process 
cannot be considered a true nonmonotonic 
change mechanism. 


REPLACEMENT 


If stepwise revisions can only go so far to 
explain nonmonotonic change, what alter- 
native is there? Knowledge structures can 
be replaced. That is, an alternative represen- 
tation of a domain is constructed in paral- 
lel with a prior one through processes that 
do not use the prior one as input. The old 
and the new representations then compete 
for the control of discourse and behavior in 
the course of question answering, explana- 
tion, reasoning, and problem solving. The 
new, presumably more veridical representa- 
tion frequently wins, and the old one even- 
tually fades from disuse. 


Bottom-up Replacement. Replacement can 
proceed either bottom-up or top-down. 
First, consider a new representation built 
bottom-up. This might occur when the 
new knowledge is encountered in a context 
that does not necessarily evoke the conflict- 
ing prior knowledge. For example, students 
might experience science instruction as so 
distant from everyday experience that they 
build representations of what is taught in 
class that are independent from, and un- 
connected to, the former. The outcome of 
such encapsulated knowledge is an ability to 
solve textbook problems without enriched 
understanding of relevant phenomena en- 
countered in other contexts (everyday ex- 
perience, news reports, etc.). Owing to the 
compartmentalization of contexts, the con- 
flict between the prior intuitive theory and 
the new theory is not salient to the learner, 
and the construction of the new theory can 
proceed without interference from prior 
knowledge. 

If matters remain in this state, it is doubt- 
ful whether this can be considered successful 
nonmonotonic learning. The crucial ques- 
tion is whether the new theory, once con- 
structed, can migrate into and usurp the 
territory of the prior intuitive conception. 
Successful nonmonotonic learning requires 
that a phenomenon previously understood 
within the intuitive theory begin to be un- 
derstood within the new theory instead. 


Top-Down Replacement. Consider the possibility 
of top-down generation of a new 
knowledge structure. An abstract schema 
might be acquired in an alternative domain 
and transferred wholesale to a new domain. 
An example of this hypothetical process is 
provided by more recent attempts to understand 
the operation of the immune system in 
Darwinian terms. Philosophers and theoret- 
ical biologists have attempted to formalize 
Darwin’s theory of evolution (Thompson, 
1989), and the resulting abstract schema has 
been applied to the question of how the im- 
mune system could produce antibodies for 
a wide variety of antigens. The Darwinian 
answer is that the immune system continu- 
ally generates more or less random antibod- 
ies. High fit between antibodies and antigens 
triggers increased production of the former; 
thus, the antigens themselves function as an 
environment that selects for the antibodies 
that fight them (Gazzaniga, 1992). The ac- 
curacy of this theory of the immune system 
is not the issue here. It is an example of a 
process in which a complex abstract schema 
was transferred as a whole to provide a cog- 
nitive template for a novel theory of a phys- 
iological process far removed from the evo- 
lutionary processes of speciation and adap- 
tation for which the schema was originally 
constructed. 

This top-down process is limited in that it 
relies on the prior existence of an appropri- 
ate abstract schema, which raises the ques- 
tion of where abstractions originate. This 
issue has remained controversial for more 
than two millennia. The standard sugges- 
tions include induction over exemplars (see 


tion (see Greenfield, Chap. 27). Because the 
topic of abstraction is discussed elsewhere in 
this volume, we do not intend to answer this 
question here. 

Side-stepping the issue of where an ab- 
stract schema comes from in the first place, 
we first need to know whether top-down re- 
placement is possible, given that an abstract 
schema exists. To test the feasibility of this 
top-down replacement process, we are in- 
structing students about a domain-general 
abstract schema that might serve as a tem- 
plate for understanding multiple concepts in 
many domains. One example is the schema 
of emergence (Chi, in press), which has ap- 
plications in biology, chemistry, and physics. 
It is plausible that direct instruction of this 
sort results in the de novo construction of an 
alternative conception, as opposed to grad- 
ual transformation of a prior conception. 


TRANSFER VIA ANALOGY 


Existence of an abstract schema may not be a 
necessary requisite for the top-down process 
to work. A concrete schema from another 
domain might serve as template if the two 
domains are easy enough to align that the 
transfer process can operate via analogy (see 
Holyoak, Chap. 6). In this hypothetical pro- 
cess, the learner acquires a schema in some 
source domain S; later, he or she is learning 
about some target domain T for which he or 
she already has an intuitive theory. The new 
information about T contradicts his or her 
prior intuitive theory about T but is analo- 
gous to what is known about S. If the learner 
creates a new representation for T based on 
what is known about S instead of building di- 
rectly on his or her current intuitive theory 
of T, then he or she might avoid distortion 
by assimilation. 

We tested the reality of this transfer of 
concrete schema process in a virtual reality- 
based scenario for teaching children that 
the Earth is round (Johnson et al., 1999, 
2001; Ohlsson et al., 2000). We created a 
virtual planet that was small enough so the 
consequences of sphericality were immediately 
perceivable. For example, even minor 
movements through the virtual world made 
objects visibly “appear” or “disappear” over 
the horizon. Once the children had acquired a 
notion of living on a spherical planet in the context 
of this fictional asteroid (about which they 
were not expected to have any distorting 
prior views), we supported, via 
a one-on-one dialogue, the analogical transfer 
of that schema to the context of the 
Earth. Pre- to posttest comparisons between 
the treatment and a dialogue-only control 
group showed that the effect of prior learn- 
ing in the virtual environment was positive 
(albeit small in magnitude). We infer that 
the schema for the virtual asteroid to some 
extent served as template for the new con- 
ception of the Earth that we tried to teach 
them. Hence, the learning paradox was over- 
come by stimulating the children to build a 
representation of what life on a sphere is like 
independent of their prior knowledge of the 
Earth, and then encouraging the use of that 
representation as a template for building a 
new representation of the Earth. 


ONTOLOGICAL SHIFT 


Ontological categories refer to a set of categories 
into which people partition the world 
in terms of its most fundamental features 
(as opposed to characteristic and defining 
features; Chi, 1997). For example, two high- 
level categories that people are likely to 
partition the different types of entities in 
the world into are substances and processes. 
Each type of entity is conceptualized as 
having certain fundamental properties. For 
example, substances such as sand can be 
contained in a box, but processes such as 
a baseball game cannot; however, processes 
can last for 2 hours, but substances can- 
not. Misconceptions are miscategorizations 
of entities into wrong ontological categories. 
For example, students typically misconceive 
heat or electricity as a substance that can 
move from one location to another (Chi, 
Slotta, & de Leeuw, 1994). Continued study 
of some entity that is initially believed to be-
long to category X might reveal proper-
ties that are not consistent with its ontolog-


ing requires that the learner re-represent 
the entity as belonging to another ontolog- 
ical category, such as from a kind of sub- 
stance to a kind of process (Slotta, Chi, & 
Joram, 1995). 

This kind of ontological shift replaces a 
prior conception with a new conception in 
terms of an entity’s ontological status. Thus, 
this process of ontological shift may qualify 
as a kind of nonmonotonic mechanism.


Toward a Theory of Learning 


In 1965, Robert M. Gagné published The 
Conditions of Learning, which summarized 
what was known about learning at the time. 
His approach was the unusual one of assum- 
ing that there are multiple, distinct types of 
learning processes distinguishable with re- 
spect to their prerequisites, processes, and 
results. He presented these in order of in- 
creasing complexity, beginning with “sig- 
nal learning” (simple conditioning) and end- 
ing with “problem solving” (Gagné, 1965). 
The most noteworthy feature of his ap- 
proach is signaled by the book’s title: For 
each type of learning, Gagné asked un- 
der which conditions that type of learning 
might occur. 

In our efforts to summarize what is 
known about the acquisition of complex 
declarative knowledge, we, too, have been 
led to present a list of different types of learn- 
ing. In the realm of monotonic learning, we 
distinguish between seven different dimen- 
sions of change: size, connectedness, con- 
sistency, grain, complexity, abstraction, and 
vantage point. In the realm of nonmonotonic 
change, we have specified numerous non-
learning modes of response to contradictory
information, such as assimilation and the evasive
processes of abeyance, bolstering, and recalibra-
tion, and explained why many of the learn-
ing mechanisms cannot in principle pro-
duce true nonmonotonic learning. Finally,
even our proposals with respect to non-
monotonic learning break down into multi-
ple processes such as transformation via lo-
cal repairs, bottom-up compartmentalized
replacement, top-down replacement with
the help of abstract schemas, transfer of con- 
crete schema via analogies, and ontologi- 
cal shift. It seems likely that, as the study 
of complex learning progresses, cognitive 
scientists will further our understanding of 
these replacement processes. 

However, as Gagné clearly saw 40 years 
ago, a list of learning processes is by itself an 
incomplete theory of learning. One would 
expect such a theory to support explana- 
tion of learning outcomes, to allow us to 
say why one subject matter is more diffi-
cult to acquire than another, to predict the
success rate of particular instructional sce- 
narios, and so on. However, to accomplish 
these and other theoretical tasks, we need 
to know when, under which circumstances, 
one or the other learning process is likely 
to occur. A predictive science of complex 
learning requires that we can specify the 
when and wherefore of the many process hy- 
potheses that spring from the imagination of 
the cognitive scientist. Nowhere is this more 
obvious than in the case of nonmonotonic 
learning. This, we suggest, is the research 
front in the study of complex declarative 
learning. 


Note 


1. In social cognition research, intuitive theories 
are called belief systems (Fishbein & Ajzen, 
1975; Rokeach, 1960, 1970). Although the two 
constructs of intuitive theory and belief sys- 
tem are essentially identical, this connection 
between social and cognitive psychology has 
been overlooked on both sides (but see Schultz 
& Lepper, 1996). 
CHAPTER 17


Thinking as a Production System 


Marsha C. Lovett 
John R. Anderson 


Thinking as a Production System 


Since their birth (ca. late 1960s), produc- 
tion systems have been developed as a formal 
tool not only for describing but for explain- 
ing how humans think. Indeed, “to advance 
our understanding of how humans think” is 
the stated goal of Newell and Simon’s clas- 
sic book, Human Problem Solving (1972), in 
which the first body of work on production- 
system models of human thought was pre- 
sented (see Novick & Bassok, Chap. 14). The 
main goal for production systems in psy- 
chological research has changed little in the 
intervening years, and yet the state of the 
art has advanced dramatically. The aim of 
this chapter is to present a contemporary 
production-systems approach to open ques- 
tions in problem solving, reasoning, anal- 
ogy, and language. We highlight the ways in 
which today’s production systems allow for 
more flexibility, stochasticity, and sensitiv- 
ity than their predecessors. Besides demon- 
strating that production systems can offer 
insight into current questions and add to 
our understanding of human thinking, we 


discuss our view of production systems in 
future research. 


Background on Production Systems 


A production system is a set of production 
rules — each of which represents a contin- 
gency for action — and a set of mechanisms 
for matching and applying production rules. 
Because the production rule is the funda- 
mental unit of this formalism, it is worth giv- 
ing a few examples. Table 17.1 presents four 
sample production rules written in English. 
Note that each is divided into two parts by 
the word “then”: The first part of each pro- 
duction rule (before the “then”) specifies the 
conditions under which that production rule 
is applicable, and the second part specifies 
the actions to be applied. Conditions may re- 
flect an aspect of the external world (e.g., it 
is dark) or an internal, mental state (e.g., my 
current goal is to reach a particular location, 
or I can retrieve a particular fact). Likewise, 
actions may transform a feature in the real 
world (e.g., flip the light switch) or an in- 
ternal, mental state (e.g., change my current 
goal, or add a fact to memory). 
Table 17.1. [...] in English

Number   Specification of Production Rule

1        When my current goal involves navigating in a dark room,
         then I flip the light switch in that room.

2        When my current goal is to go to a location that is more than
         300 miles away,
         then I set a subgoal to go to the local airport.

3        When my current goal is to answer an arithmetic problem of the
         form D1 + D2,
         then I change the goal to try retrieving the sum of D1 and D2
         from memory.

4        When my current goal is to answer an arithmetic problem of the
         form D1 + D2,
         then I hold up D2 fingers and change the goal to count them
         starting with the number after D1.


To operate, a production system requires 
a dynamic memory that represents the cur-
rent state of the system and is used to
match against production rules’ conditions. 
For example, when dynamic memory in- 
cludes the goal “to get to San Francisco,” the 
second production rule in Table 17.1 would 
match for someone in Pittsburgh, Pennsyl- 
vania. This pattern matching of production 
rules to dynamic memory leads to a set of po- 
tentially applicable production rules called 
the conflict set. However, not all production 
rules in the conflict set are applied. The pro- 
cess of conflict resolution specifies which pro- 
duction rules from the conflict set will ex- 
ecute their actions or fire. These actions are 
likely to change the external and/or internal 
state of the system reflected in a change to 
dynamic memory. Then, a potentially differ- 
ent set of production rules may comprise the 
conflict set, and the cycle continues. 
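
The cycle just described (match, conflict resolution, fire, repeat) can be sketched in a few lines of Python. This is only an illustrative toy, not the mechanism of any particular architecture: the `Rule` class, the string facts held in dynamic memory, and the "first listed rule wins" resolution policy are all assumptions made for the example. The two rules loosely paraphrase the first two rules of Table 17.1.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

Memory = Set[str]  # dynamic memory as a set of simple string facts (an assumption)


@dataclass
class Rule:
    name: str
    condition: Callable[[Memory], bool]  # does the rule match dynamic memory?
    action: Callable[[Memory], None]     # how firing changes dynamic memory


def run(rules: List[Rule], memory: Memory, max_cycles: int = 10) -> List[str]:
    """Repeat match -> conflict resolution -> fire until nothing matches."""
    fired = []
    for _ in range(max_cycles):
        conflict_set = [r for r in rules if r.condition(memory)]
        if not conflict_set:
            break
        chosen = conflict_set[0]  # placeholder resolution: first listed rule wins
        chosen.action(memory)     # firing changes memory, so the next match differs
        fired.append(chosen.name)
    return fired


def flip_switch(m: Memory) -> None:
    m.discard("room is dark")
    m.add("room is lit")


def set_airport_subgoal(m: Memory) -> None:
    m.discard("goal: reach distant city")
    m.add("goal: reach local airport")


RULES = [
    Rule("flip-light-switch", lambda m: "room is dark" in m, flip_switch),
    Rule("go-to-airport", lambda m: "goal: reach distant city" in m, set_airport_subgoal),
]

memory = {"room is dark", "goal: reach distant city"}
trace = run(RULES, memory)
print(trace)
print(sorted(memory))
```

On the first cycle both rules are in the conflict set but only one fires; its action changes dynamic memory, so the second cycle starts from a different conflict set. The loop stops when no rule matches.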

One way to view how production rules 
operate is by analogy to stimulus-response
associations; that is, when a particular stim- 
ulus is present, an associated response is trig- 
gered. This fits with the notion that a pro- 
duction rule cannot be directly verbalized 
but rather is observable through behavior. 
This analogy to stimulus-response associa-
tions emphasizes the fact that production
systems do not operate via a homunculus 


interpreting production rules as program- 
ming code. Instead, each production rule — 
when it matches dynamic memory — has 
the potential to fire and change the current 
state, thus setting other production rules 
into action. 

This discussion leads to the question of 
what it means to model thinking as a produc- 
tion system: What are the theoretical impli- 
cations associated with representing knowl- 
edge as production rules? The following 
are four features commonly attributed to 
production-rule representations: 


1. Production rules are modular. Each pro- 
duction rule represents a well-circum- 
scribed unit of knowledge such that any 
production rule can be added, refined, 
or deleted independently of other pro- 
duction rules in the system. Moreover, 
each production rule is atomic such that 
it would be added, refined, or deleted
as a whole unit. It is important to note,
however, that this modularity does not 
preclude production rules from inter- 
acting with each other extensively in a 
running system. Indeed, adding a new 
production rule to an existing set can — 
and often does — completely change the 
functioning of the system because of the 
way production rules’ actions impact each
other. Early modelers (Klahr & Wallace, 1976; Young
& O'Shea, 1981) took advantage of this 
feature by adding or deleting produc- 
tion rules to explicitly test how that 
change would impact the system’s be- 
havior. More recently, production systems 
have been developed with autonomous 
learning mechanisms that enable the sys- 
tem’s production rules to change based 
on experience. In these systems, mod- 
ularity is achieved because these learn- 
ing mechanisms create and modify indi- 
vidual production rules independently of 
other rules. 


2. Production rules are asymmetric. Each pro-
duction rule is a unidirectional contin-
gency for action. This means that the pro-
duction rule “When I want to type the
letter ‘j’, then I punch my right index fin- 
ger” is different from “When I punch my 
right index finger, then I type the letter 
‘j’”. Moreover, asymmetry and modularity 
imply that, if these two production rules 
were in the same system, adding, delet- 
ing, or refining the former would not di- 
rectly change the latter. That is, practicing 
typing would exercise the first produc- 
tion rule, strengthening the index-finger 
response when “j” is the desired letter, but 
it would not strengthen one’s knowledge 
that “j” appears when touch-typing with 
that finger. For expert touch-typists, this 
asymmetry is quite noticeable: Without 
looking at a keyboard, try to identify the 
letter that is typed with your left index 
finger. Tough, isn’t it? Typing the word 
“frog” would have been easier. Such asym- 
metry has been documented in many con- 
texts (see Singley & Anderson, 1989, for 
a review). 


3. Production rules can be abstract. Produc-
tion rules allow for generalization be-
cause their conditions may be repre-
sented as templates that match to a wide
range of patterns. These conditions spec- 
ify the relationship(s) between items 
without specifying the items themselves 
(e.g., “When A is taller than B and B is 
taller than C, then say A is taller than C” 
is true for any values of A, B, and C). 


lationships allows for transfer of learning 
across different situations as long as they 
fit within the conditions of the given pro- 
duction rule. For example, the first pro- 
duction rule in Table 17.1 could match to 
a dark dining room, living room, or office, 
meaning that experience at flipping the 
light switch in any of these rooms would 
transfer to the others. Likewise, the third 
production rule in Table 17.1 could match 
to any two-addend addition problem. 


4. Production rules cannot be directly verbal-
ized. This feature is based on the notion
that each production rule represents 
knowledge about a contingency for action 
that is not directly accessible to verbal- 
ization. A good example of this occurs 
when someone knows how to drive a stan- 
dard transmission car but cannot explain 
it verbally. It is important to note that, 
while this feature implies that knowledge 
represented in production-rule form can- 
not be accessed directly, it does not im- 
ply that one cannot use other techniques 
to talk about performance knowledge. For 
example, when changing gears in a stan- 
dard transmission car, it is possible to ob- 
serve one’s own performance and verbally 
describe these observations. Also, knowl- 
edge about how to perform a task may be 
represented in multiple forms — some that 
can be verbalized and some that cannot. 
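
The abstraction feature in the list above (a condition written as a template over relationships, with the items left as variables) can be illustrated with a small sketch. The tuple encoding of facts and the function name `match_transitive` are assumptions for this example only. The same rule applies unchanged to people or to buildings because its condition names no particular items.

```python
from itertools import permutations
from typing import List, Set, Tuple

Fact = Tuple[str, str, str]  # (relation, first item, second item)


def match_transitive(memory: Set[Fact], relation: str = "taller") -> List[Fact]:
    """Find every binding of A, B, C with (A rel B) and (B rel C) in memory,
    and return the conclusions (A rel C) that the rule's action would assert."""
    items = sorted({x for _, a, b in memory for x in (a, b)})
    conclusions = []
    for a, b, c in permutations(items, 3):
        if (relation, a, b) in memory and (relation, b, c) in memory:
            conclusions.append((relation, a, c))
    return conclusions


people = {("taller", "Ann", "Bob"), ("taller", "Bob", "Cy")}
buildings = {("taller", "tower", "house"), ("taller", "house", "shed")}

print(match_transitive(people))
print(match_transitive(buildings))
```

Trying every ordered triple is a brute-force way to find variable bindings; it keeps the sketch short but would not scale to a large memory.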


This last point confronts a common miscon- 
ception about production systems — namely, 
that knowledge about rules or procedures is 
necessarily represented as production rules. 
Whereas knowledge about rules and pro- 
cedures can be represented in production- 
rule form, it is not the content of knowl- 
edge that determines how it is represented. 
Instead, the four features listed previously 
serve as a set of necessary conditions for 
knowledge to be considered as being repre- 
sented in production-rule form. To illustrate 
the distinction between knowledge con- 
tents and representational form, Table 17.2 
shows that the same knowledge content 
(either column) can be represented in a 
production-rule form (top entry) or not 
Table 17.2

                         Knowledge Contents

Representational Form: Production rule
   Rulelike:  When I want to type a letter and I know its finger move,
              then make that move
   Factlike:  When I want to type the letter “j,” then I punch with my
              right index finger on the home row

Representational Form: Declarative fact
   Rulelike:  To touch-type, one must make the finger move
              corresponding to the currently desired letter
   Factlike:  The letter “j” goes with the right index finger in
              home-row position


(bottom entry as a declarative fact). So, 
when considering what it means for knowl- 
edge to be represented in production-rule 
form, the key is not in what knowledge is 
being represented but rather in how. 


Production Systems, Then and Now 


The first production systems set out to es- 
tablish a mechanistic account of how hu- 
man adults perform relatively short, mod- 
erately difficult, symbolic tasks (Newell & 
Simon, 1972). Besides demonstrating that 
production systems could solve these tasks, 
the main goal was to connect the sys- 
tem’s processing steps to human problem- 
solving steps. Several features distinguish 
these early production systems from their 
current-day progeny. First, early production 
systems tended to focus on demonstrating 
human-like performance; current models rely 
heavily on learning mechanisms to derive 
predictions about learning and performance 
across time. Second, early models focused 
on reproducing qualitatively the processing 
steps of individual problem solvers, whereas 
more recent models have been submitted to 
both quantitative analyses of fit to aggre- 
gate data (e.g., average reaction times for 
various conditions) and qualitative analy- 
ses (e.g., whether the model demonstrates 
the same errors as people).¹ Third, the role
of noise processes has increased drastically 
from early models that avoided stochastic 
processes completely to current day mod- 
els in which stochasticity plays an important 
role (Lebiere, Anderson, & Bothell, 2002; 
Lebiere et al., 2003). Fourth, early models 
focused on the “cognitive” layer of process- 
ing and eschewed integrating receptors and 


effectors into models. In contrast, current 
production systems incorporate and empha- 
size perception and action in their frame- 
works (Anderson & Lebiere, 1998; Meyer 
& Kieras, 1997). Finally, the fifth feature 
that distinguishes early and recent produc- 
tion systems is so strongly linked to the 
early models that it has sometimes been 
considered a defining feature of produc- 
tion systems. This is the symbolic nature of 
early production systems. However, almost 
all modern production systems take a hybrid 
view by positing symbolic representations 
as important conceptual units and acknowl- 
edging graded representations as a valuable 
additional layer (e.g., associating continu- 
ously valued quantities with each produc- 
tion rule). 


Current Production Systems in Context 


This section provides a brief overview of four 
production systems currently being used in 
a variety of cognitive modeling situations. 
The systems to be described are ACT-R 
(Anderson & Lebiere, 1998), EPIC (Meyer & 
Kieras, 1997), Soar (Laird, Newell, & Rosen- 
bloom, 1991), and 4-CAPS (Just, Carpenter, 
& Varma, 1999). ACT-R emphasizes the 
notion of a cognitive modeling architecture 
in which the same set of mechanisms and 
representational schemes are used to cap- 
ture human learning and performance across 
tasks. Recently, this has been extended to 
map various ACT-R mechanisms and mod- 
ules to particular brain regions for compar- 
ison with neuroimaging data. EPIC has fo- 
cused on capturing the connections among 
the cognitive, perceptual, and motor sys- 
tems. Recently, EPIC has been used to make 
to-action loops in multiple-task situations
and across the adult age span. Soar was origi- 
nally developed to address issues in both psy- 
chology and artificial intelligence. Recently, 
it has been particularly successful in simu- 
lating multiagent, dynamic interactions with 
real-world application (e.g., Jones et al.,
1999). The 4-CAPS architecture, like its pre- 
decessor 3-CAPS (Just & Carpenter, 1992), 
focuses on individual differences. 

To delineate the space of current produc- 
tion systems, we next highlight the dimen- 
sions along which these systems differ. First, 
they differ with regard to their degree of pro- 
cessing parallelism. Toward one end of the 
spectrum, ACT-R posits that only a single 
production rule can fire at a time. However, 
ACT-R allows for parallelism in other ways: 
asynchronous parallelism among its percep-
tual and motor modules, parallel retrieval of
information from declarative memory, and 
parallel production-rule matching and selec- 
tion. Soar similarly posits serial processing 
in that a single operator is chosen in each 
decision phase, but this is preceded by an 
elaboration phase that allows parallel pro- 
duction firing. 4-CAPS allows parallel firing 
of production rules for all cycles, but this 
parallelism is subject to a capacity limitation 
such that the more production rules firing, 
the less rapidly each of them is executed. 
EPIC is the only system with fully parallel 
production-rule firing. To manage its mul- 
tiply threaded central cognition, EPIC uses 
task-coordination strategies that impose or- 
dering constraints when necessary. 

Another dimension along which the sys- 
tems differ is the degree of modularity they 
propose. Soar is at one end of this spectrum 
because of its unitary structure — a single set 
of production rules representing long-term 
memory. 4-CAPS posits a number of distinct 
sets of production rules connected to each 
other. In ACT-R and EPIC, multiple mod- 
ules correspond to separate perceptual and 
motor modalities and to “central cognition.” 
These modules are considered encapsulated, 
independent processors with their interac- 
tions handled by the production system. 

Although all four systems produce quan- 
titative predictions that match well to 


particularly focused on production-rule 
learning as well. Yet another dimension in 
which these architectures differ is their com-
mitment to hybridization, with Soar commit-
ted to a purely symbolic account, whereas
ACT-R and 4-CAPS postulate continuously 
varying quantities that drive the processing 
of symbolic units. EPIC does have contin- 
uously varying parameters associated with 
various modules but does not appear to 
have information-laden nonsymbolic quan- 
tities in its theory of central cognition. 
Finally, production systems differ in the 
role that noise processes play in their pro- 
cessing. In Soar, their role is minimal (i.e., 
when a “tie” between production rules arises, 
one of them is chosen at random). In ACT-
R and 4-CAPS, noise processes are assumed
to be added to the various continuously varying
computations that influence system perfor- 
mance. In EPIC, noise is used more to rep- 
resent variability in system parameters (e.g., 
rate parameter in Fitts’s law governing mo-
tor movements) than to represent a generic
nondeterminism of the system. 


Organization of the Remainder 
of the Chapter 


Our own research has involved the ACT- 
R system and slight variants. In this chap- 
ter, we describe six ACT-R models with 
which we are familiar that address differ- 
ent aspects of cognition. We do not fo- 
cus on the ACT-R details of these mod- 
els but rather on how they illustrate the 
general trends in production-system mod- 
els toward softer, more flexible, and highly 
detailed characterizations of human cogni- 
tion. We place each model in the multi- 
dimensional space described previously by 
highlighting the following features: Does 
the model include both performance and 
learning mechanisms? Does the model make 
use of symbolic (rule-based) and other 
continuously varying computations? Does 
the model draw upon multiple processing 
modules beyond a central production-rule 
memory? We use the template in Table 17.3 
to summarize how each model fits into this 
three-dimensional space in terms of its use 
Table 17.3. [...] ACT-R Model

Declarative chunks
   Performance mechanisms
      Symbolic:     Knowledge (usually facts) that can be directly
                    verbalized
      Subsymbolic:  Relative activation of declarative chunks affects
                    retrieval
   Learning mechanisms
      Symbolic:     Adding new declarative chunks to the set
      Subsymbolic:  Changing activation of declarative chunks and
                    changing strength of links between chunks

Production rules
   Performance mechanisms
      Symbolic:     Knowledge for taking particular actions in
                    particular situations
      Subsymbolic:  Relative utility of production rules affects choice
   Learning mechanisms
      Symbolic:     Adding new production rules to the set
      Subsymbolic:  Changing utility of production rules


of various ACT-R representations and mech- 
anisms. In addition, we comment on how 
each model makes use of parallelism and 
noise processes, as appropriate. 

We use the term “subsymbolic” to refer 
to the numerical values and computations 
associated with each symbolic unit. In this 
sense, the prefix “sub” refers to a level of de- 
scription below the symbolic units and that 
determines those symbolic units’ access in 
competition with other symbols. The use 
of the term subsymbolic from a connec- 
tionist perspective often refers to the fact 
that symbols may be represented in a dis- 
tributed fashion, with the prefix sub refer- 
ring to the pieces of the pattern that con- 
stitute a symbol. For instance, Smolensky 
(1988, p. 3) writes, “The name subsymbolic 
paradigm is intended to suggest cognitive 
descriptions built up of entities that corre- 
spond to constituents of the symbols used 
in the symbolic paradigm; these fine-grained 
constituents could be called subsymbols, and 
they are the activities of individual process- 
ing units in connectionist networks.” It is 
an interesting question whether these two 
views are really in contradiction. The sub- 
symbolic values discussed in this chapter are 
updated and used only locally, but at the 
same time have a global impact on the sys- 
tem’s processing, just as the activations of 
units in a connectionist system do. As an ex- 
ample of this, consider the utility values as- 
sociated with production rules: When multi- 


ple production rules match the current situ- 
ation, the one with the highest utility value 
succeeds in firing. This competition occurs 
among the individual units themselves with- 
out any explicit selection by a controlling ho- 
munculus and without any conscious access 
to the utility values. Another important kind 
of numerical quantity in our subsymbolic 
representation is similarities between sym- 
bols. With these quantities, a production rule 
can partially match against a symbol similar 
to the one specified in its condition, allowing 
the system to realize soft constraints. This 
fact further blurs the difference between the 
two senses of “subsymbolic.” Work exploring 
a connectionist implementation of the ACT-
R architecture (Lebiere & Anderson, 1993)
suggests that symbolic units represented in 
a distributed fashion can yield the behavior 
of a symbolic system that has continuously 
valued quantities influencing the access and 
use of its symbols. 


Choice 


One of the perennial questions in problem- 
solving research involves how solvers make 
choices: choices of the next step, of an ap- 
propriate solution strategy, and of whether 
to use weak (domain-general) versus strong 
(domain-specific) methods. Indeed, around 
the time when production systems were first 
developed, Newell and Simon introduced 
the idea that the very process of problem 
Figure 17.1. Initial state and three possible subsequent problem states from the
Building Sticks Task. 


solving could be viewed as search in a prob- 
lem space, which equates problem solving 
with a series of choices. Research then ad- 
dressed the question, “How do solvers make 
choices?” by focusing on cases in which 
solvers have little or no domain knowledge. 
Production-rule models representing various 
problem-solving heuristics predicted perfor-
mance and established links between heuris-
tics and human data. Current research asks,
“How do solvers make choices?” but focuses 
on cases in which solvers have prior, relevant 
experience. This is, at its heart, a question 
about learning, so production systems that 
learn from their experience may offer addi- 
tional insight. 

In a set of studies by Lovett (Lovett 
& Anderson, 1996; Lovett, 1998), partici- 
pants’ choice learning in the Building Sticks 
Task (BST) was studied and modeled within 
ACT-R. The BST is an isomorph of the wa- 
ter jars task (Luchins, 1942) such that, in 
each problem, solvers must add and subtract 
the lengths of three building sticks to equal 
the length of a goal stick (see top of Fig- 
ure 17.1). Solvers face a choice between two 
strategies (see bottom row of Figure 17.1): 
overshoot, which involves starting with the 
longest building stick and shortening it by 
the others, and undershoot, which involves 
starting with the short or medium building 
stick and then lengthening it. In these stud- 
ies, participants encountered the BST for the 


first time and solved a sequence of problems 
in which the proportion of problems that 
could be solved by each of the two strategies 
was manipulated (e.g., 30% overshoot-only 
problems and 70% undershoot-only prob- 
lems or vice versa). The results can be sum- 
marized in three main findings: 


1. Participants’ choices initially followed a 
hill-climbing heuristic with little bias to- 
ward undershoot or overshoot. 

2. With experience, participants gradually 
learned to prefer the more successful 
strategy for their condition. 

3. Changes in strategy choice were sensitive 
to recent experiences in that participants 
were more likely to choose the strategy 
that had been successful on the previous 
(or even second-previous) problem. 


The model that was built for this task has 
since been applied in various forms to ac- 
count for choice learning in several other 
tasks (see Lovett, 1998). Here we describe 
the BST model specifically. The model was 
initially endowed with production rules that 
implement two domain-general heuristics, 
hill-climbing and guessing, for the partic-
ulars of this task. For example, the guess-
overshoot production rule makes the first
overshoot move regardless of the details of
the problem, and guess-undershoot does this
for undershoot. These productions repre-
sent an uninformed guess that their action
will lead to [...]
BST problem. In addition, the hillclimb- 
overshoot production makes the first over- 
shoot move but only matches when this 
move takes the initial state closest to the goal 
state; hillclimb-undershoot does the same 
for undershoot. These productions repre- 
sent knowledge for taking the action that 
looks best according to a hill-climbing met- 
ric. Note that three of these four produc- 
tion rules will match to each BST prob- 
lem’s initial state — both guess production 
rules and one hillclimb production rule, 
whichever matches the stick lengths of the 
problem (e.g., hillclimb-undershoot in Figure
17.1). Note also that, although three pro- 
duction rules match to an initial stimulus, 
two of them produce the same response but 
on the basis of different knowledge (i.e., 
two separate production rules). This empha- 
sizes that production rules are not simply 
stimulus-response associations but repre-
sent additional information in their condi-
tions, which defines the (potentially differ-
ent) scopes over which they apply.
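
To make the matching pattern concrete, the following sketch builds the conflict set for the initial state of one BST problem. It is a simplification under stated assumptions, not the published model: the stick lengths are invented, the first undershoot move is taken to start with the medium stick, and ties between the two hill-climbing rules are broken arbitrarily in favor of undershoot.

```python
from typing import Dict, List


def bst_conflict_set(sticks: List[int], goal: int) -> Dict[str, str]:
    """Map each matching production rule to the strategy its action starts.

    Expects exactly three building-stick lengths.
    """
    longest = max(sticks)
    medium = sorted(sticks)[1]   # assumption: undershoot starts with the medium stick
    over_gap = abs(longest - goal)
    under_gap = abs(goal - medium)
    matched = {
        "guess-overshoot": "overshoot",    # matches regardless of stick lengths
        "guess-undershoot": "undershoot",  # matches regardless of stick lengths
    }
    # Only one hill-climbing rule matches: the one whose first move
    # leaves the state closest to the goal.
    if over_gap < under_gap:
        matched["hillclimb-overshoot"] = "overshoot"
    else:  # ties go to undershoot here, an arbitrary choice for the sketch
        matched["hillclimb-undershoot"] = "undershoot"
    return matched


example = bst_conflict_set(sticks=[15, 250, 55], goal=125)
for rule, strategy in example.items():
    print(rule, "->", strategy)
```

Three rules match, and two of them propose the same response for different reasons, which is the point made in the text about conditions defining each rule's scope.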

Beyond the task-specific composition of 
its production rules, this model’s most 
important features come from ACT-R’s gen- 
eral, subsymbolic computations for pro- 
duction-rule utility values. Each production 
rule has an associated utility — learned by ex- 
perience — that represents a combined esti- 
mate of how successful and costly that pro- 
duction rule is likely to be. Whenever the 
model is faced with multiple matching pro- 
duction rules, there is a noisy selection pro- 
cess that fires the production rule with the 
highest subsymbolic utility value. This noise 
process serves to lead the model generally 
to choose the production rule that has been 
most useful in past experience, but to do so 
a proportion of the time consistent with that 
production rule’s utility relative to the com- 
peting production rules’ utility (e.g., com- 
peting production rules with very close util- 
ity values are selected virtually at random). 
These utility values are learned from experi- 
ence according to a prespecified mechanism: 
Specifically, each production rule’s utility is 
computed arithmetically as a time-weighted 
average of its past success rate combined
with a (negated) time-weighted average of
its past costs, where cost is measured in time 
the production rule “spends” when fired. 
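
The noisy selection process can be sketched as follows. This is a minimal illustration, assuming Gaussian noise added independently to each utility; the noise distribution, the `noise_sd` parameter, and the utility values are choices made for the example and are not taken from the model's published parameters.

```python
import random
from typing import Dict


def choose_rule(utilities: Dict[str, float], noise_sd: float, rng: random.Random) -> str:
    """Add independent noise to each matching rule's utility and fire the maximum."""
    noisy = {name: u + rng.gauss(0.0, noise_sd) for name, u in utilities.items()}
    return max(noisy, key=noisy.get)


def choice_rates(utilities: Dict[str, float], noise_sd: float = 1.0,
                 trials: int = 10_000, seed: int = 0) -> Dict[str, float]:
    """Estimate how often each rule wins the competition over many selections."""
    rng = random.Random(seed)
    counts = {name: 0 for name in utilities}
    for _ in range(trials):
        counts[choose_rule(utilities, noise_sd, rng)] += 1
    return {name: n / trials for name, n in counts.items()}


close = choice_rates({"overshoot": 5.0, "undershoot": 5.1})
far = choice_rates({"overshoot": 5.0, "undershoot": 9.0})
print(close)
print(far)
```

With nearly equal utilities the two rules are chosen at close to chance, whereas a large utility gap makes the better rule win almost every time. The higher-utility rule is therefore preferred in proportion to its advantage rather than deterministically.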

In the case of the BST model, learned 
utility values average in new experiences 
of success and failure across trials, allow- 
ing the model to gradually increase the util- 
ity value for production rules that have had 
greater success and lower cost, and hence 
to gradually increase the likelihood of fir- 
ing more useful production rules. Thus, the 
model shows the same gradual preference 
for the more successful BST strategy, as do 
participants. In addition, because this up- 
dating mechanism includes a time-weighted 
decay, the impact of recent successes and 
costs on a production rule’s overall utility 
value is greater, leading the model — like 
participants — to change strategy choice with 
greater sensitivity to recent experiences. 
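
The effect of time-weighted decay on learned utilities can be shown with a short sketch. The exponential weighting (`decay ** age`), the `goal_value` scale factor, and the example histories are assumptions for illustration; the model's actual decay function and parameter values are not specified in this passage.

```python
from typing import List, Tuple

Experience = Tuple[bool, float]  # (succeeded?, time cost), listed oldest first


def weighted_average(values: List[float], decay: float) -> float:
    """Average in which an experience k steps in the past gets weight decay**k."""
    n = len(values)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def utility(history: List[Experience], decay: float = 0.7,
            goal_value: float = 20.0) -> float:
    """Time-weighted success rate (scaled by goal_value) minus time-weighted cost."""
    success = weighted_average([1.0 if ok else 0.0 for ok, _ in history], decay)
    cost = weighted_average([c for _, c in history], decay)
    return goal_value * success - cost


# Same overall success rate and cost, but the successes fall at different times.
early_wins = [(True, 5.0), (True, 5.0), (False, 5.0), (False, 5.0)]
late_wins = [(False, 5.0), (False, 5.0), (True, 5.0), (True, 5.0)]

print(utility(early_wins), utility(late_wins))
print(utility(early_wins, decay=1.0), utility(late_wins, decay=1.0))
```

Both histories contain two successes in four trials, yet with decay the rule whose successes are recent ends up with the higher utility. Setting `decay=1.0` removes the recency effect and the two utilities coincide.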


Summary 


This production-system model of problem- 
solving choice specifies a set of fairly generic 
production rules to represent the heuris- 
tics of guessing and hill-climbing and then 
draws on ACT-R’s pre-existing production- 
rule mechanisms to learn to solve problems 
by experience. The major claim, then, is that 
strategy-choice learning is strongly guided 
by problem-solving experiences of success 
and cost associated with using those strate- 
gies and that strategies are effectively repre- 
sented as production rules. More specifically, 
the model posits that choices in problem 
solving are governed by an implicit competi- 
tion among production rules based on their 
utility values (a subsymbolic performance 
mechanism) and that these utilities are up- 
dated naturally based on experience (a sub- 
symbolic learning mechanism). The corre- 
sponding two subsymbolic, production-rule 
cells have checks in Table 17.4. Although 
this model does not address how produc- 
tion rules specific to this task are acquired 
(i.e., there is no symbolic production-rule 
learning), its initial production rule set is 
composed mainly of general heuristics that 
have been adapted only slightly to the con- 
text of the particular task. In other words, 




Table 17.4. Knowledge Structures and Mechanisms Used in a Production-Systems 
Model of Choice 

                      Performance Mechanisms      Learning Mechanisms 
                      Symbolic    Subsymbolic     Symbolic    Subsymbolic 
Declarative chunks    ✓ 
Production rules      ✓           ✓                           ✓ 


for this relatively knowledge-lean task, it is 
reasonable to suspect that participants and 
the model can manage without acquiring 
many new, task-specific production rules. It 
is also interesting that, in this task, produc- 
tion rules — with their somewhat broad con- 
ditions of applicability — largely determine 
the behavior of the system; although declara- 
tive knowledge is involved, it is not involved 
critically in the explanation of the phenom- 
ena. This representational bias is supported 
by the relatively broad, within-task transfer 
that problem solvers show in carrying over 
their strategic preferences from trained BST 
problems to novel BST problems. 


Analogy 


Analogy, the process of finding and using 
correspondences between concepts, plays 
a fundamental and ubiquitous role in hu- 
man cognition (see Holyoak, Chap. 6). From 
mathematical problem solving (Novick & 
Holyoak, 1991) to computer programming 
(Anderson & Thompson, 1989) to creative 
discovery (Holyoak & Thagard, 1995), anal- 
ogy facilitates better understanding of old 
knowledge and the formation and inference 
of new knowledge. The critical step in anal- 
ogy is finding a mapping from objects and re- 
lations in the source or known domain, where 
pre-existing knowledge forms the base of 
the analogy, to objects and relations in the 
target or novel domain, where knowledge 
from the source domain will be applied. Nu- 
merous researchers have proposed theories 
that describe how analogical mapping takes 
place (Gentner, 1983, 1989; Hofstadter & 
Mitchell, 1994; Holyoak & Thagard, 1989; 
Hummel & Holyoak, 1997; Keane, Ledgeway, 
& Duff, 1994; Kokinov, 1998). A common feature 
of these theories is that they 
require a mixture of symbolic and subsym- 
bolic processes. The symbolic processes are 
required to reason about the structure of 
the domains, but the softness of subsymbolic 
processes is required to stretch the analogy 
in semantically plausible ways. 

Given the requirement of a mixture of 
symbolic and subsymbolic processes, mod- 
ern production systems would seem well 
designed to model analogy. Salvucci & An- 
derson (2001) describe a relatively success- 
ful application of the ACT-R theory to 
modeling results in the analogy literature. 
Before reviewing it, we would like to high- 
light the value added by such a theory. Al- 
though the model incorporates many of the 
insights of the other theories, it is not just 
a matter of implementing these theories in 
ACT-R. As a complete theory of cognition, 
the model contributes three factors lack- 
ing in these other models. First, it naturally 
maps these processes onto precise predic- 
tions about real world metrics of latency and 
correctness, rather than the more qualita- 
tive and ordinal predictions that have typ- 
ified other theories. Second, it integrates the 
process of analogy with the rest of cogni- 
tion and thus makes predictions about how 
processes such as eye movements are inter- 
leaved with the analogy process. Third, it 
shows that the mechanisms underlying anal- 
ogy are the same as the mechanisms under- 
lying other aspects of cognitive processing. 

Figure 17.2 illustrates the representation 
of the famous solar system analogy (Gentner, 
1983) in the Salvucci and Anderson system. 
Analogs are represented as higher-order 
structures built up of three components: 
objects, relations, and roles. The first two 
components, objects and relations (repre- 
sented as ovals in Figure 17.2) serve the same 
Objects are the semantic primitives of the 
analogs, whereas relations link objects or re- 
lations together according to their function. 
The solar-system domain contains the two 
objects ss-sun and ss-planet, along with the 
three relations ss-causes, ss-attracts, and ss- 
revolves. Similarly, the atom domain con- 
tains the two objects at-nucleus and at- 
electron and the three relations at-causes, 
at-attracts, and at-revolves. The boxes in Fig- 
ure 17.2 represent the third component of an 
analog structure — roles, which serve to link 
objects and relations to form higher-order 
conceptual structures. Each role comprises 
five components: 


parent: a pointer to the parent relation 


parent-type: the semantic type of the par- 
ent relation 


slot: the relation slot that the object fills 
in the relation 


child: a pointer to the child object or re- 
lation 


child-type: the semantic type of the child 
object or relation. 


For example, in the case of the ss-attractor 
role, ss-attracts is the parent, attracts is the 
parent-type, attractor is the slot, ss-sun is the 
child, and sun is the child-type. 
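
As an illustration, the five role components can be written as a small Python record. The class name and field spellings are assumptions made for this sketch. The values for ss-attractor come from the example above; those for ss-cause are read off Figure 17.2, with its parent and child pointers inferred from the relation names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    parent: str       # pointer to the parent relation
    parent_type: str  # semantic type of the parent relation
    slot: str         # relation slot that the child fills
    child: str        # pointer to the child object or relation
    child_type: str   # semantic type of the child


ss_attractor = Role("ss-attractor", parent="ss-attracts", parent_type="attracts",
                    slot="attractor", child="ss-sun", child_type="sun")

# A higher-order role: its child is itself a relation (pointers inferred).
ss_cause = Role("ss-cause", parent="ss-causes", parent_type="causes",
                slot="cause", child="ss-attracts", child_type="attracts")
```

Because the child of ss-cause is the relation that serves as the parent of ss-attractor, following child and parent pointers from role to role traces a path through the analog.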

Salvucci and Anderson (2001) describe a 
path-mapping process by which the struc- 
ture in the source is made to correspond to 
the structure in the analog. This mapping 
process is achieved by production rules that 
essentially walk through these graphs look- 
ing for correspondences. The critical step 
in this mapping is retrieving roles from the 
target domain to map onto roles in the 
source domain. This is achieved by the par- 
tial matching process in ACT-R that selects 
the most similar role. Similarity between the 
source and target role is determined based on 
the similarities among the parent-type, slot, 
and child-type components of the roles. One 
of the consequences is that the model can 
be misled to select inappropriate analogs on 
the basis of surface similarity between the 
components of a source and target. For instance, 
the model successfully simulated the confusions 
in probability problems based on 
surface similarities between examples. One 
limitation of the path-mapping process built 
into this model is that it only considers one 
proposition at a time. For that reason, the 
model cannot solve analogies that require 
the consideration of multiple propositions in 
parallel, whereas people and other models 
can (e.g., Hummel & Holyoak, 1997). 
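
The retrieval step at the heart of path mapping can be sketched as follows. This is a deliberately simplified, noise-free stand-in for ACT-R's partial matching: the similarity values in `SIMILARITY` are hypothetical, the function names are assumptions, and each role is reduced to its (parent-type, slot, child-type) triple as shown in Figure 17.2.

```python
# Hypothetical graded similarities between semantic types (assumed values).
SIMILARITY = {
    frozenset(("sun", "nucleus")): 0.8,
    frozenset(("planet", "electron")): 0.8,
}

# Target-domain roles as (parent-type, slot, child-type) triples.
TARGET_ROLES = {
    "at-cause": ("causes", "cause", "attracts"),
    "at-attractor": ("attracts", "attractor", "nucleus"),
    "at-effect": ("causes", "effect", "revolves"),
    "at-center": ("revolves", "center", "nucleus"),
}


def sim(a, b):
    """Similarity of two semantic types: 1.0 if identical, else a table lookup."""
    if a == b:
        return 1.0
    return SIMILARITY.get(frozenset((a, b)), 0.0)


def role_similarity(source, target):
    """Sum the similarities of the parent-type, slot, and child-type components."""
    return sum(sim(s, t) for s, t in zip(source, target))


def retrieve_best(source, targets):
    """Return the name of the target role most similar to the source role."""
    return max(targets, key=lambda name: role_similarity(source, targets[name]))
```

For the source role ss-attractor, written as `("attracts", "attractor", "sun")`, `retrieve_best` selects at-attractor. A target role that shared only surface components with the source would also earn partial credit, which is how surface similarity can mislead the mapping.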

On the other hand, the production sys- 
tem control structure leads to other predic- 
tions. Since the model goes from the source 
to the target, it has a preference for many-to- 
one mappings over one-to-many mappings. 
This enables the model to successfully pre- 
dict the results of Experiment 2 in Spellman 
and Holyoak (1996). They presented sub- 
jects with two stories involving countries on 
different planets and asked subjects to map 
countries on one planet to those on the other. 
The story relations can be summarized as 
follows: 


Story 1                          Story 2 

richer (Afflu, Barebrute)        richer (Grainwell, Hungerall) 

stronger (Barebrute, Compak)     stronger (Millpower, Mightless) 


The relations include an ambiguous map- 
ping — namely, the mapping of Barebrute 
to either Hungerall or Millpower. Subjects 
were divided into two conditions: In the 1-2 
condition, subjects mapped objects from 
story 1 to those in story 2; in the 2-1 con- 
dition, subjects mapped objects from story 
2 to story 1. In both conditions, subjects had 
the option of including any, all, or no objects 
in their mapping, thus allowing the possibil- 
ity of a one-to-one, one-to-many, or many- 
to-one mapping, if so desired. Spellman and 
Holyoak found that subjects rarely produced 
one-to-many mappings (fewer than 2% of 
subjects), whereas they frequently produced 
many-to-one mappings (more than 30% of 
subjects). 

In addition to reproducing these results 
in the literature, Salvucci and Anderson 
had subjects try to determine the analogies 
between two stories and collected their eye 




[Figure 17.2: role boxes for the solar-system source and the atom TARGET. Each box lists a parent-type, slot, and child-type; for example, ss-cause (causes, cause, attracts), ss-effect (causes, effect, revolves), at-attractor (attracts, attractor, nucleus), and at-center (revolves, center, nucleus).] 

Figure 17.2. Sample analogs for the solar-system and atom domains. 


movements while they were doing so. The 
data showed that subjects moved their eyes 
back and forth between the two stories as 
they read them and searched for the analogs. 
The Salvucci and Anderson model was able 
to predict the eye movement transitions. 
This is a critical study because it shows 
how analogy is dynamically integrated with 
cognition and how it can control — and 
be determined by — processes such as eye 
movements. 


Summary 


This production-system model of analogy 
specifies a set of production rules that 
implement a path-mapping process through 
declaratively represented source and target 
structures. That is, the model posits that 
analogies are made and used via an ex- 
plicit process of path mapping that is in- 
fluenced by the relative activation levels of 
the elements to be mapped. The subsym- 
bolic mechanisms governing declarative re- 
trieval specify which parts of those declara- 
tive structures will be retrieved and when. 
In this way, the model makes specific, quan- 
titative predictions about the results of anal- 
ogy making and its time course (as observed 
through eye movement data). Although 
analogy making is a process that produces 
new knowledge — the mapping, which in 
turn can be used to produce new inferences — 
the process of analogy usually occurs in a 
single trial without much learning. Thus, 
Table 17.5 highlights that this model of analogy 
making draws on three of the four perfor- 
mance mechanisms in ACT-R. 


Working Memory 


Just as the previous section’s model of anal- 
ogy makes heavy use of declarative knowl- 
edge and corresponding mechanisms, so 
does this section’s model of working mem- 
ory. Working memory has been implicated 
in the performance of such diverse tasks 
as verbal reasoning and prose comprehen- 
sion (Baddeley & Hitch, 1974), sentence 
processing (Just & Carpenter, 1992), free 
recall learning (Baddeley & Hitch, 1977), 
prospective memory (Marsh & Hicks, 1998), 
and note-taking and writing (Engle, 1994). 
This research has suggested that working- 
memory resources are limited because, as 
working-memory demands of a task increase, 
participants’ performance declines. 
Moreover, working-memory limitations ap- 
pear to differ across people such that some 
people show a more striking decrease in per- 
formance as a function of task demands than 
others (see also Morrison, Chap. 19). 

Each of the four production systems dis- 
cussed thus far has an account for the impact 
of working-memory demands on cognitive 
processing (see Miyake & Shah, 1999). EPIC 
implements Baddeley’s articulatory loop via 
production rules acting on the auditory store 
and vocal/motor processor. These produc- 
tion rules implement strategies for rehearsal 
and recall and are constrained by the pro- 
cessing features of the modules they engage 
(e.g., all-or-none decay of items from the au- 
ditory store and time to re-read an item by 
the vocal/motor processor). In contrast, Soar 
assumes no a priori limit to working memory 
through its dynamic memory. Rather, limitations 
arise when multiple levels of processing 
are necessary to establish multiple subgoals 
to handle a sequence of impasses. In 
the CAPS architecture(s), working-memory 
limitations are captured through a limita- 
tion in the amount of activation that can 
propagate through the system: When less 
activation is available, production-rule fir- 
ing takes more processing cycles. CAPS 
has been used to model different pat- 
terns of sensitivity to working-memory de- 
mands among groups of individuals with 
low, medium, and high capacity (e.g., Just & 
Carpenter, 1992). 

In ACT-R, working-memory limitations 
are imposed via a limitation to the amount 
of attention that can be focused on the cur- 
rent goal. This attentional activation (also 
called source activation) serves to maintain 
elements of the goal as highly active, activate 
facts in declarative memory, and suppress 
below their resting levels any facts negatively 
associated with the current goal. Although 
they sound similar, the CAPS and ACT-R 
limitations in activation are quite different. 
CAPS directly limits total activation in the 
system, whereas ACT-R limits the ability 
to differentially activate goal-relevant information 
above goal-irrelevant information. 
In other words, greater source activation in 
ACT-R is akin to having a better signal-to- 
noise ratio for retrieving facts. It is worth 
noting that the working-memory limitations 
in both ACT-R and CAPS are imposed as 
constraints on a particular model parameter, 
whereas in other working-memory accounts 
(e.g., Soar, the connectionist system LISA 
[Hummel & Holyoak, 1997], and, to some 
degree, EPIC), these limitations emerge as a 
natural consequence of general processing. 
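
The signal-to-noise reading of source activation can be made concrete with a short sketch. It assumes the activation equation as commonly stated in the ACT-R literature (total activation equals base level plus the source activation shared among goal elements, each share weighted by an association strength) and a logistic retrieval function. The equation's exact form, the function names, and every numeric value are assumptions of this sketch; none of them is given in this section.

```python
import math


def activation(base_level, strengths, w):
    """Total activation of one chunk.

    base_level: resting activation of the chunk
    strengths:  association strength from each goal element to the chunk
                (negative values model suppression)
    w:          total source activation, shared equally among goal elements
    """
    share = w / len(strengths)
    return base_level + sum(share * s for s in strengths)


def retrieval_probability(a, threshold=0.0, noise=0.5):
    """Logistic chance that a chunk with activation a is retrieved."""
    return 1.0 / (1.0 + math.exp(-(a - threshold) / noise))


def separation(w):
    """Activation gap between a goal-relevant and a goal-irrelevant fact."""
    relevant = activation(0.0, [1.5, 1.5, 1.5], w)
    irrelevant = activation(0.0, [-0.5, -0.5, -0.5], w)
    return relevant - irrelevant
```

With these illustrative strengths the gap grows in proportion to `w`, so an individual with more source activation separates relevant from irrelevant facts more cleanly, without any cap on total activation of the kind CAPS imposes.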

In this section, we demonstrate how im- 
plementation of working memory in ACT-R 
can be used to estimate individuals’ working- 
memory capacity from performance on one 
task (call it Task A) and then make accu- 
rate zero-parameter predictions of those in- 
dividuals’ performance on other tasks — B, C, 
and so on. Task A is a Modified Digit Span 
task (MODS) designed as an isomorph of the 
reading span (Daneman & Carpenter, 1980) 
and the operation span (Turner & Engle, 
1989). In this task, participants perform dual 
tasks of reading various presented charac- 
ters and memorizing the exact order of dig- 
its only. Figure 17.3 shows a sample MODS 
trial in which the participant would read 
“a j 2 b i e 6 c f 8” and then recall the digits in 
order (2 6 8). Because the task is fast paced 
(there is little time for idiosyncratic strate- 
gies), it draws on skills that are highly prac- 
ticed (there is little chance for skill or knowl- 
edge differences), and both aspects of the 
task are monitored (there is little opportu- 
nity for different levels of task compliance), 
most of the variation in performance on this 
task should be attributable to differences in 
participants’ fundamental processing capac- 
ities. For our modeling purposes, we take this 
to be variation in source activation. 


Figure 17.3. Graphic illustration of a Modified 
Digit Span task trial with a memory set of size 3. 
The differences in the positions of the characters 
on-screen have been exaggerated for clarity. 


An ACT-R model of the MODS task 
successfully fits individual participants’ data 
as a function of set size (Figure 17.4) and 
as a function of serial position for the set 
size six trials (Figure 17.5) by only vary- 
ing the source-activation parameter (Daily, 
Lovett, & Reder, 2001). This suggests that 
source activation presents a reasonable im- 
plementation of working memory that can 
explain the variation in individuals’ MODS 
performance. Moreover, because source ac- 
tivation plays the same role in all ACT-R 
models, this allows for predictions to be 
made for the same participants on other 
tasks by plugging each participant’s esti- 
mated source-activation parameter into the 
other task models. In Lovett, Daily, and 
Reder (2000), this is accomplished for the 
n-back task. Specifically, individual partici- 
pant estimates of source activation were de- 
rived from their MODS task performance 
and then used to make zero-parameter, 
individual participant predictions on the 
n-back task. 

The n-back task is a continuous trial 
paradigm in which, for a given block of trials, 
participants are asked to respond to whether 
each letter stimulus is a repeat of the stim- 
ulus “n” trials back (e.g., Braver et al., 1997; 
Cohen et al., 1994). For example, suppose 
Figure 17.4. Model fits for four representative subjects from Daily et al. 
(1999). Filled symbols are subject data; open symbols are the model’s 
predictions. 


the participant saw the stimuli “U E E R E 
K L L”. In a “1-back” block, a participant 
should say “yes” to the third and last stimu- 
lus and “no” elsewhere, whereas in a “2-back” 
block, the participant should say “yes” to the 
fifth stimulus and “no” elsewhere. As “n” in- 
creases, the working memory demands of 
the task increase and, not surprisingly, performance 
degrades. Figure 17.6 shows high-fidelity 
modeling fits at the individual participant 
level in the n-back task by using the 
individualized source activation parameter 
values that were estimated from the same 
participants’ MODS performance. 
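
The correct responses in the n-back example can be checked mechanically. The short Python sketch below is only an illustration of the task rule; the function name is an assumption, and positions are reported counting from 1 to match the text.

```python
def n_back_targets(stimuli, n):
    """For each stimulus, report whether it repeats the stimulus n trials back."""
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


stimuli = list("UEEREKLL")

# Positions (counting from 1) that call for a "yes" response.
one_back = [i + 1 for i, hit in enumerate(n_back_targets(stimuli, 1)) if hit]
two_back = [i + 1 for i, hit in enumerate(n_back_targets(stimuli, 2)) if hit]
```

Here `one_back` is `[3, 8]` (the third and last stimuli) and `two_back` is `[5]`, in agreement with the example.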


Summary 


This model of working memory includes 
production rules to perform the various tasks 
studied. Across all tasks, the ACT-R archi- 
tecture provides a single theory of working 
memory in which working-memory limitations 
are represented by a fixed amount of 
source activation, propagated from the cur- 
rent focus of attention to increase the acti- 
vation of goal-relevant items and to decrease 
the activation of goal-irrelevant items. The 
larger this source activation for a given in- 
dividual, the greater the degree of facilita- 
tion (suppression) of goal-relevant (irrele- 
vant) items. This leads to direct performance 
implications as a function of source activa- 
tion, plus there are indirect effects in the 
model (e.g., more rehearsals are possible be- 
cause of faster retrievals with high source ac- 
tivation) that can further the implications. 
In sum, this working-memory model relies 
most heavily on the relative activation lev- 
els of declarative chunks (both those that 
are part of the initial model and those that 
are newly acquired as part of task perfor- 
mance); this is highlighted by the check 
marks filling the declarative chunks row in 
Table 17.6. 


Categorization 


Research on human category learning has a 
history that extends back at least to Hull’s 
(1920) study of learning to categorize Chi- 
nese symbols and his conclusions in favor 
of an associative learning proposal. It was 
an important domain early in the cognitive 
revolution during which theorists argued 
for various hypothesis-testing theories (e.g., 
Trabasso & Bower, 1964; Levine, 1975). The 
hypothesis-testing theories were based on 
research with stimuli that had a very simple, 
often one-dimensional categorical structure. 
The 1970s saw a renewed interest in more 
complex, fuzzy categories and proposals for 


[Figure 17.5 panels: Subject 201 (W = 0.9) and Subject 203 (W = 1.1); x-axis: Serial Position, 1 to 6; y-axis ticks from 0.0 to 1.0.] 

Figure 17.5. Fits to the serial position data (largest set size only) for four typical 
subjects. Filled symbols are subject data; open symbols are the model’s 
predictions. 
1975) and exemplar theories (e.g., Medin 
& Schaffer, 1978). The rise of connection- 
ist models resulted in the proposal of asso- 
ciative theories (e.g., Gluck & Bower, 1988) 
not that different from the original Hull 
proposal. Whereas the original research fo- 
cused on accuracy data, new emphasis has 
been on latency data to help choose among 
theories (e.g., Lamberts, 1998; Nosofsky 
& Palmeri, 1997). Recently, neuro-imaging 
and other cognitive neuroscience data have 
been recruited to try to decide among alternative 
theories (e.g., Ashby et al., 1998; 
Smith, Patalano, & Jonides, 1998). Impressive 
growth has been attained in the characterizations 
of the phenomena in category 
learning (see Medin and Rips, Chap. 3). 
However, the field does not seem any closer 
to coming to consensus regarding what “the” 
mechanism of category learning is. 
Anderson and Betz (2001) produced a 
production system model that reflected the 
belief that this contest of theories was mis- 
placed and that different mechanisms were 
being used to different degrees in differ- 
ent experiments. In particular, they imple- 
mented two alternative models in ACT-R 
that have been advanced for categorization — 
Nosofsky, Palmeri, and McKinley’s (1994) 
rule-plus-exception (RULEX) model and 
Nosofsky and Palmeri’s (1997) exemplar- 
based random-walk (EBRW) model. The 
first model proposes that subjects store ex- 
plicit rules for category membership and 
possible exceptions. The EBRW model pro- 
poses that subjects retrieve instances that 
are similar to the test stimulus and assign 
the stimulus to the category that has the 
most retrieved exemplars after exceeding 
a particular threshold. Whereas the origi- 
nal models are mathematical characteriza- 
tions of participants’ behavior, the ACT-R 
model is a computational system that actu- 
ally performs the task. Production rules pro- 
vide the control structure for how the ACT- 
R model approaches the task (e.g., whether 
it employs a RULEX- or EBRW-based ap- 
proach), whereas declarative memory stores 
the rules, exceptions, and examples used and 
subsymbolic components of the architecture determine 
which production rules and declarative 
structures are retrieved at any time. 

The component of the model incorpo- 
rating an EBRW approach retrieves past in- 
stances from memory as a function of their 
similarity to the current stimulus. This de- 
pends critically on the ability of the ACT-R 
system to retrieve partially matching traces. 
Specifically, the probability of retrieving a 
memory in ACT-R is a function of how sim- 
ilar it is to the memory probe. Anderson and 
Betz (2001) show that this retrieval function 
yields a similar, but not identical, selection 
rule to that used in the original Nosofsky 
and Palmeri formulation. In addition, the 
ACT-R mechanism for chunk strengthening 
favors the retrieval of more frequently pre- 
sented items and therefore produces a speed 
increase similar to that in EBRW (which 
uses multiple traces and a Logan (1988) race 
process). Although the original EBRW and 
the ACT-R implementation are not identi- 
cal, they prove largely indistinguishable in 
their predictions. This near-equivalence is 
strongly dependent on the pre-existing sub- 
symbolic processes built into ACT-R. 
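
The exemplar-based random walk itself can be sketched briefly. The code below follows the verbal description given earlier (retrieve exemplars in proportion to their similarity to the probe and respond once one category leads by a threshold); it is neither the published EBRW equations nor the ACT-R implementation. The exponential similarity function over city-block distance, the parameter values, and all names are assumptions of this sketch.

```python
import math
import random


def similarity(x, y, c=1.0):
    """Similarity decays exponentially with city-block distance (assumed form)."""
    distance = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-c * distance)


def ebrw_classify(probe, exemplars, threshold=3, rng=random):
    """Random walk over retrieved exemplars.

    exemplars: list of (features, category) pairs with category "A" or "B".
    Returns the chosen category and the number of retrievals, a proxy for latency.
    """
    weights = [similarity(probe, features) for features, _ in exemplars]
    counter, steps = 0, 0
    while abs(counter) < threshold:
        _, category = rng.choices(exemplars, weights=weights, k=1)[0]
        counter += 1 if category == "A" else -1
        steps += 1
    return ("A" if counter > 0 else "B"), steps
```

A probe that lies close to many exemplars of one category reaches the threshold in few retrievals, so the sketch yields both a response and a relative latency.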

The component of the ACT-R model 
implementing a RULEX approach depends 
more on the symbolic production-level sys- 
tem because the actual logic of hypothesis 
testing in RULEX is quite complex (e.g., dif- 
ferent rules specify when to settle on a hy- 
pothesis, when to switch from single dimen- 
sion to multiple dimension rules, and when 
and how to form exceptions). Nevertheless, 
the subsymbolic level of ACT-R, which gov- 
erns the selection among production rules 
based on their ever-changing utility values, 
is essential for this model component to cap- 
ture the randomness of RULEX. Indeed, this 
noisy selection process enables this model 
component to reproduce the wide variety of 
hypotheses that subjects display. 

The Anderson and Betz effort is a rela- 
tively successful integration of the two mod- 
els. Moreover, the effort adds value over 
the two original models. First, it establishes 
that the two theories are not necessarily 
Figure 17.6. N-back performance and model predictions for individual 
participants where parameter estimates of individuals’ working-memory 
capacities were derived from performance on the Modified Digit Span task. 


in opposition and in fact reflect the same 
underlying subsymbolic processes but with 
different symbolic control. Moreover, those 
subsymbolic processes are the same ones 
that can be used to model other, very dif- 
ferent domains of human cognition. Also, 
because both categorization mechanisms are 
able to sit within the same architecture, 
Anderson and Betz were able to address the 
issue of choice between the two mecha- 
nisms. This depends on the relative utility of 
these two mechanisms. Anderson and Betz 
show that the mixture of the two strate- 
gies is able to account for phenomena that 
cannot be accounted for by either strategy 
alone. They also show a natural tendency for 
this mixture of strategies to evolve from be- 
ing dominated by rule-based classification to 
being dominated by instance-based classification 
because the latter is more efficient. 
Figure 17.7 shows the tendency for exemplar 
use to increase in two of the models reported 
by Anderson & Betz. This increased exem- 
plar use is consistent with reported results 
of a strategy shift with extensive practice 
(Johansen & Palmeri, 2002). 


Summary 


This contemporary production-system 
model of categorization integrates two app- 
roaches (implemented as different sets of 
cohabitating production rules) and chooses 
between them (based on the production 
rules’ learned utility values). In one ap- 
proach, production rules are the conduit 
for creating and accessing exemplars (im- 
plemented as declarative chunks) in a 
Figure 17.7. Proportion exemplar use across blocks in two data sets modeled in 
Anderson and Betz (2001). 


context-sensitive and frequency-sensitive 
way. In the other approach, production 
rules create and manipulate declarative 
rules for categorizing items. In all cases, the 
ACT-R subsymbolic learning mechanisms 
for production rules and declarative chunks 
govern how these kinds of knowledge are 
used. Table 17.7 highlights this (see checks 
in the right column) as well as the fact 
that this model employs ACT-R’s symbolic 
learning mechanism for declarative chunks. 


Skill Learning 


Research into skill learning can be roughly 
divided into two categories. One category 
focuses on how skills are learned in the first 
place (e.g., Catrambone, 1996; Chi et al., 
1989; VanLehn & Jones, 1993). The other 
focuses on how skills are refined to achieve 
domain expertise (see also Novick & Bassok, 
Chap. 14). Research in the former category 
has addressed issues of learning from in- 
struction, transfer, and induction. Research 
in the latter category has addressed issues 
of generalization, specialization, and auto- 
maticity. A unified approach merges these 
issues into a single explanation. Production- 
systems models — particularly those that ad- 
dress the question of production-rule learn- 
ing — hold the promise of offering such an 
explanation. 

Among production-systems models, Soar 
holds the most parsimonious view of 
skill learning, with its single mechanism, 


Table 17.7. This Model of Categorization Relies on Three Out of Four of ACT-R’s 
Learning Mechanisms 

                      Performance Mechanisms      Learning Mechanisms 
                      Symbolic    Subsymbolic     Symbolic    Subsymbolic 
Declarative chunks    ✓           ✓               ✓           ✓ 
Production rules      ✓           ✓                           ✓ 

Table 17.8. Two Parent Production Rules and the Learned Child Production Rule 

Production A    When the goal is to add the numbers x and y, then try to 
                retrieve the sum of x and y 

Production B    When the goal is to add the numbers x and y and the sum 
                of x and y has been retrieved as z, then update the goal 
                with z as the answer 

Production C    When the goal is to add 2 to 5, then update the goal with 7 
                as the answer 


chunking. Chunking is invoked whenever the 
system encounters an impasse (i.e., when ex- 
isting production rules do not directly spec- 
ify the next step). At this point, the system 
creates a subgoal to solve the impasse by 
applying domain-general production rules. 
Solving the impasse creates a new rule spe- 
cialized for that situation. A similar rule- 
learning process is employed by Cascade, 
a model of skill acquisition that incorpo- 
rates both the impasse-repair-reflect cycle 
and analogical problem solving (VanLehn, 
1999). After the new rule is learned, when 
Cascade subsequently encounters the same 
(or a related) situation, it can apply the new 
rule directly and avoid the extra processing. 
These models employ specialization — mak- 
ing a new rule that is a specific version of its 
parents — and composition — combining mul- 
tiple production rules into one new rule. 
ACT-R also has a production-rule learn- 
ing mechanism. This mechanism combines 
composition — merging two production rules 
that fire in sequence — and proceduralization — 
creating a new version of an existing produc- 
tion rule in which the new version avoids 
fact retrieval by instantiating necessary in- 
formation directly into the new rule. For ex- 
ample, consider a pair of production rules 
that solve addition problems of the form 
x + y = ? by first retrieving the relevant 
addition fact from memory and then using 
this fact to make a response (A and B in 
Table 17.8). When these production rules 
are applied to the problem 2 + 5 = ?, a single 
production rule is learned (C in Table 17.8) 
that combines the two steps into one but 
is specific to the case of 2 + 5. This mechanism 
treats skill learning as a ubiquitous 
process of building more specific, more 
powerful, and less explicit problem-solving 
knowledge. Greater power comes from 
the knowledge’s being faster, no longer 
subject to retrieval failures, and incurring 
lower working-memory load. Less explic- 
itness comes from the fact that the new 
rule transforms a fully inspectable, declara- 
tive fact into the body of a production rule, 
where knowledge is not open to inspection. 
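
The compilation of Productions A and B into Production C can be mimicked in a few lines of Python. This is a toy sketch under stated assumptions: goals are plain dictionaries, declarative memory is a lookup table, and all function names are hypothetical. It is not how ACT-R represents or compiles rules internally.

```python
ADDITION_FACTS = {(2, 5): 7, (3, 4): 7}  # illustrative declarative memory


def production_a(goal):
    """A: when the goal is to add x and y, request the sum of x and y."""
    return ("retrieve", goal["x"], goal["y"])


def production_b(goal, retrieved_sum):
    """B: when the sum has been retrieved as z, update the goal with z."""
    return dict(goal, answer=retrieved_sum)


def compile_productions(goal):
    """Fire A then B once and return a specialized rule C with the fact built in."""
    _, x, y = production_a(goal)
    z = production_b(goal, ADDITION_FACTS[(x, y)])["answer"]

    def production_c(new_goal):
        # C applies only to the specific problem it was compiled from.
        if (new_goal["x"], new_goal["y"]) != (x, y):
            return None
        return dict(new_goal, answer=z)  # no retrieval needed any more

    return production_c
```

Calling `compile_productions({"x": 2, "y": 5})` returns a rule that answers 7 for 2 + 5 without consulting `ADDITION_FACTS`, and that does not apply to any other problem. The answer is now buried in the rule body rather than stored as an inspectable fact, which is the loss of explicitness noted above.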

We exemplify ACT-R’s production-rule 
learning in the context of an experimental 
paradigm in which rule-like knowledge is 
learned in many different forms (Anderson 
& Fincham, 1994; Anderson, Fincham, and 
Douglass, 1997). This paradigm involves 
teaching participants a number of sports 
facts such as “Hockey was played on Saturday 
at 3 PM and then on Monday at 1 PM.” 
After committing these sports facts to mem- 
ory, participants are told that each one con- 
veys a particular pattern or rule for the game 
times for that sport (e.g., Hockey’s second 
game time is always two days later and two 
hours earlier than its first). Participants are 
then given practice at using these sports facts 
to solve new problems in which either the 
first or second time is given and the other 
must be predicted. Figure 17.8a shows the 
speed-up in performance from Anderson & 
Fincham (1994) as participants practiced 
this over three days (each “bump” occurred 
at the beginning of a new day). Figure 17.8a 
also shows the predictions of an ACT-R sim- 
ulation (Taatgen and Wallach, 2002) that in- 
volves four representations of the sports-fact 
knowledge. 
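
The rule that participants must apply can be written down directly. The sketch below only illustrates the task logic, not the ACT-R simulation: the function names are assumptions, and hours are treated as plain numbers with no wrap-around on the clock.

```python
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]


def second_game(first_day, first_hour, day_shift, hour_shift):
    """Predict the second game time from the first."""
    day = DAYS[(DAYS.index(first_day) + day_shift) % 7]
    return day, first_hour + hour_shift


def first_game(second_day, second_hour, day_shift, hour_shift):
    """Predict the first game time from the second by reversing the shifts."""
    return second_game(second_day, second_hour, -day_shift, -hour_shift)


HOCKEY = (2, -2)  # two days later, two hours earlier
```

With these definitions, `second_game("Saturday", 3, *HOCKEY)` gives Monday at 1, and `first_game("Monday", 1, *HOCKEY)` recovers Saturday at 3, matching the studied hockey fact.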

Figure 17.8b tracks the contribution of 
these four sources of knowledge over the 
three days. The initial representation was 
simply the set of eight studied sports 
facts represented as declarative facts (see 
first row of Table 17.9). Specifically, each 
Table 17.9. Representations of the Sports Facts from Anderson and Fincham (1994)

Knowledge Type                               Declarative vs. Production   How Generated                                No. per Sport      No. of Steps Required
Original sports fact components              Declarative                  Original study                               4                  20
General relationships                        Declarative                  Analogy on original sports fact components   2                  ~10
Procedural relation rule                     Production                   Production compilation on relationships      4                  ~6
Studied instance (& often repeated) example  Declarative                  Result of previous example                   For each example   2



sports fact was represented in terms of four 
interrelated chunks to capture the two days 
and two times for that sport (e.g., “Hockey’s 
first event day was Saturday”, “Hockey’s first 
event time was 3”, “Hockey’s second event 
day was Monday”, “Hockey’s second event 
time was 1”). To solve problems using these 
facts, the model was endowed with a set 
of production rules representing the weak 
methods of direct retrieval (applicable for 
the original facts) and analogy. 

From this initial knowledge base, the 
model generated the other three representa- 
tions of the sports-fact knowledge. The first 
of these represents the rule-like relationships 
of each original sports fact as two declarative 
chunks (e.g., “Hockey’s day relationship is 
+2”, and “Hockey’s time relationship is
−2”). The model produces this declaratively
represented generalization as a byproduct of 
the analogizing process (see second row of 
Table 17.9). Once these generalized rela- 
tionships are derived, applying them to a 
new problem is much simpler than solving 
by analogy. The second new representation 
of knowledge comes in true production-rule 
form. Specifically, a new production rule 
is learned that merges the steps involved 
in applying the declarative generalizations 
just mentioned. Note that this production 
rule is specialized to the sport and direction 
(time 1 → time 2 or vice versa) under which
it was generated. Such a directional produc- 
tion rule should show faster performance 
for problems in the practiced direction, 


and Anderson and Fincham showed that 
such asymmetry develops with extensive 
practice. 
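
The first two representations can be illustrated with a short Python sketch. It assumes the hockey example from the text (Saturday at 3, then Monday at 1); the chunk layout, the function names, and the decision to ignore wrap-around on the clock are simplifying assumptions made here, not details of the Taatgen and Wallach simulation. The sketch derives the two declarative relationships from the four studied chunks and then applies them in either direction.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Four chunks for one studied sports fact (the example used in the text).
hockey_chunks = {"day1": "Saturday", "time1": 3, "day2": "Monday", "time2": 1}


def derive_relationships(chunks):
    """Generalize a studied fact into a day shift and a time shift."""
    day_shift = (DAYS.index(chunks["day2"]) - DAYS.index(chunks["day1"])) % 7
    time_shift = chunks["time2"] - chunks["time1"]
    return {"day": day_shift, "time": time_shift}


def predict(relationships, day, time, forward=True):
    """Apply the relationships from game 1 to game 2, or in reverse."""
    sign = 1 if forward else -1
    new_day = DAYS[(DAYS.index(day) + sign * relationships["day"]) % 7]
    new_time = time + sign * relationships["time"]  # no clock wrap-around
    return new_day, new_time


hockey_rel = derive_relationships(hockey_chunks)
```

Here `hockey_rel` holds a day shift of +2 and a time shift of −2. A compiled production rule in the sense of the text would correspond to fixing both the sport and the `forward` argument, which is why practice in one direction need not transfer fully to the other.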

The third new representation is a specific 
instance representing the solution to a par- 
ticular previous (and often repeated) prob- 
lem. This knowledge can complete a new 
problem in just two steps (one each for the 
day and time). However, it is specific to a 
particular problem and is only generated af- 
ter the preceding forms of knowledge have 
paved its way. It predicts that participants 
will be faster on frequently repeated prob- 
lems, and Anderson, Fincham, and Dou- 
glass (1997) provide evidence for such item- 
specific learning. 


Summary 


The most noteworthy aspect of this 
production-systems model of skill learning 
is that it posits multiple, overlapping stages 
in the development of a new skill, some of 
which represent the new skill knowledge in 
production-rule form and some of which do 
not. Because of the acquisition of new pro- 
duction rules and new declarative chunks, 
the model relies on both symbolic learning 
mechanisms in ACT-R. In addition, these 
new knowledge representations are refined 
and strengthened through experience, draw- 
ing on ACT-R’s subsymbolic learning mech- 
anisms. Finally, the model chooses among 
the different knowledge representations via 
the subsymbolic performance mechanisms: 
Figure 17.8. Latency to respond across trials in each session in Anderson and
Fincham, 1994 (panel a), and proportion of simulation runs in which particular
knowledge representations were used across trials in Taatgen and Wallach, 2002
(panel b).


As declarative representations are strength- 
ened through use, those with higher activa- 
tion will tend to get retrieved, and as new 
production rules are used and are successful, 
those with higher utilities will tend to get 
chosen (over more generic production rules 
that employ declarative representations). In 
sum, this model draws on all eight mecha- 
nisms presented in Table 17.10. 
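
The two subsymbolic quantities at work here can be made concrete with a small Python sketch. The activation function follows the usual ACT-R base-level learning form, the logarithm of a sum of power-law-decayed traces, with a decay of 0.5 as the conventional default. The selection function is a deliberate simplification: it adds Gaussian noise to each utility and picks the maximum, and the function names and noise parameter are assumptions made for illustration rather than ACT-R's own interface.

```python
import math
import random


def base_level_activation(ages, decay=0.5):
    """Activation of a chunk given the time since each of its past uses."""
    return math.log(sum(age ** -decay for age in ages))


def choose_production(utilities, noise_sd=0.5, rng=random):
    """Pick the production whose utility plus noise is highest."""
    noisy = {name: u + rng.gauss(0.0, noise_sd)
             for name, u in utilities.items()}
    return max(noisy, key=noisy.get)
```

A chunk used three times has higher activation than one used once, and a compiled rule with higher utility wins most, but not all, of the competitions against a generic rule. That residual noise is what allows several representations to coexist during learning.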


Language Learning: Past Tense 


The learning of the English past tense is an- 
other domain in which symbolic and sub- 
symbolic models have clashed. The appear- 
ance of over-regularization errors in chil- 
dren’s past tense (e.g., go-goed as opposed 
to go-went) had been originally taken as 
evidence (e.g., Brown, 1973) that children 
were acquiring abstract rules. However,
Rumelhart and McClelland (1987) showed 
that by learning associations between the 
phonological representations of stems and 
past tense it was possible to produce a 
model that made overgeneralizations with- 
out building any rules into it. It was able 
to account for the U-shaped learning func- 
tion demonstrated by children by which they 
first do not produce such overgeneralization, 
then do, and finally, gradually eliminate the 
overgeneralizations. This attracted a great 
many critiques and, although the fundamen- 
tal demonstration of generalization without 
rules stands, it is acknowledged by all to be 
seriously flawed as a model of the process 
of past-tense generation by children. Many 
more recent and more adequate connection- 
ist models (some reviewed in Elman et al., 
1996) have been proposed, and many of 
these have tried to use the backpropagation
learning algorithm. 

This would seem like an appropriate domain
for production-system models, and
Taatgen and Anderson (2002) have produced
a successful model of these phenomena.
Significantly, they show that one can account
for past-tense learning with a dual-mechanism
model similar to that of Anderson
and Betz (2001). The model posits that
children initially approach the task of past- 
tense generation with two strategies. Given 
a particular word like “give,” they can either 
try to retrieve the past tense for that word or 
they can try to retrieve some other example 
of a past tense (e.g., live-lived) and try to 
apply this by analogy to the current case. In 
the case of analogy, previously encountered 
present—past tense pairs serve as potential 
sources, and a source that has a present tense 
form similar to the target’s present tense 
form will be retrieved. Then, the transformation
driving the past-tense form in the
retrieved source is applied to the tar- 
get. Eventually, through the production-rule 
learning mechanisms in ACT-R, the analogy 
process will be converted into a production 
rule that generatively applies the past-tense 
rule. Once the past-tense rule is learned, the 
generation of past tenses will be determined 
largely by a competition between the gen- 
eral rule and retrieval of specific cases. Thus, 
ACT-R has basically a dual-route model of 
past-tense generation in which both routes 
are implemented by production rules. The 
rule-based approach depends on general 
production rules whereas the exemplar ap- 
proach depends on the retrieval of declara- 
tive chunks by production rules that imple- 
ment an instance-based strategy. 
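
The competition between the two routes can be sketched as follows. This is a minimal illustration, not the Taatgen and Anderson model: the memory format, the fixed retrieval threshold, and the hard-coded regular rule are all assumptions made here (in the model itself the rule is learned by compiling the analogy process, and the choice is governed by learned utilities). The sketch shows only the basic logic that a stored form is used when its activation is high enough and the general rule is used otherwise.

```python
def regular_rule(verb):
    """Hypothetical general rule: add the regular past-tense suffix."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def past_tense(verb, memory, threshold=0.0):
    """Return (form, route) using retrieval if possible, else the rule."""
    entry = memory.get(verb)
    if entry is not None and entry["activation"] >= threshold:
        return entry["past"], "retrieval"
    return regular_rule(verb), "rule"


# A weakly encoded exception cannot be retrieved, so the rule overgeneralizes.
memory = {"go": {"past": "went", "activation": -1.0}}
early = past_tense("go", memory)

# Once the exception has been strengthened, retrieval wins again.
memory["go"]["activation"] = 1.0
late = past_tense("go", memory)
```

With these values `early` is the overregularized form produced by the rule and `late` is the retrieved exception, which is the pattern behind the two turns of the U-shaped function.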

Figure 17.9 graphically displays the vari- 
ety of ways this model can generate the past 
tense. Although all of these options are im- 
plemented in ACT-R production rules, only 
the two rightmost options represent the ap- 
plication of general past-tense rules (e.g., add 
“ed”). The second and third options initi- 
ate procedures for retrieving a memory trace 
that can then be applied directly or by anal- 
ogy to the current situation. 

The general past-tense rule, once discov- 
ered by analogy, gradually enters the com- 
petition as the system learns that this new 
rule is widely applicable. This gradual entry, 
which depends on ACT-R’s subsymbolic 
utility-learning mechanisms, is responsi- 
ble for the onset of overgeneralization. Al- 
though this onset is not all-or-none in either 
the model or the data, it is a relatively rapid 
transition in both model and data and cor- 
responds to the first turn in the U-shaped 
function. However, as this is happening, the 
ACT-R model is encountering and strength- 
ening the declarative representations of 
Figure 17.9. Different choices the model can make in generating the past tense.
Each option is executed by the firing of a production rule, but only the two
rightmost options actually implement a generalized rule. ACT-R’s
production-rule competition and learning mechanisms govern the model’s
selection among these options.


exceptions to the general rule. Retrieval 
of the exceptions comes to counteract 
the overgeneralizations. Retrieval of excep- 
tions is preferred because they tend to 
be shorter and phonetically more regu- 
lar (Burzio, 2002) than regular past tenses. 
Growth in this retrieval process corresponds 
to the second turn in the U-shaped function 
and is much more gradual — again, both in 
model and data. 

Note that the Taatgen model, unlike 
many other past-tense models, does not 
make artificial assumptions about frequency 
of exposure but learns given a presentation 
schedule of words (both from the environ- 
ment and its own generations) like that ac- 
tually encountered by children. Its ability 
to reproduce the relatively rapid onset of 
overgeneralization and slow extinction de- 
pends critically on both its symbolic and 
subsymbolic learning mechanisms. Symbol- 
ically, it is learning general production rules 
and declarative representations of excep- 
tions. Subsymbolically, it is learning the util- 
ity of these production rules and the activa- 
tion strengths of the declarative chunks. 

Beyond just reproducing the U-shaped 
function, the ACT-R model explains why 
exceptions should be high-frequency words. 
There are two aspects to this explanation. 
First, only high-frequency words develop 
enough base-level activation to be retrieved. 
Indeed the theory predicts how frequent 
a word has to be to maintain an excep- 
tion. Less obviously, the model explains 
why so many high-frequency words actu- 
ally end up as exceptions. This is because the 
greater phonological efficiency of the irregu- 
lar form promotes its adoption according to 


the utility calculations of ACT-R. Indeed, in 
another model that basically invents its own 
past-tense grammar without input from the 
environment, Taatgen showed that it will de- 
velop one or more past-tense rules for low- 
frequency words but will tend to adopt more 
efficient irregular forms for high-frequency 
words. In the ACT-R economy the greater 
phonological efficiency of the irregular form 
justifies its maintenance in declarative mem- 
ory if it is of sufficiently high frequency. 

Note that the model receives no feedback
on the past tenses it generates,
unlike most models but in apparent correspondence
with the facts about child language
learning. However, it receives input from the 
environment in the form of the past tenses 
it hears, and this input influences the base- 
level activation of the past-tense forms in 
declarative memory. The model also uses 
its own past-tense generations as input to 
declarative memory and can learn its own er- 
rors (a phenomenon also noted in cognitive 
arithmetic by Siegler, 1988). The amount of 
overgeneralization displayed by the model 
is sensitive to the ratio of input it receives 
from the environment to its own past-tense 
generations. 


Summary 


Although this model of past-tense gener- 
ation fully depends on the existence (and 
emergence) of rules and symbols, it also criti- 
cally depends on the subsymbolic properties 
of ACT-R to produce the observed graded 
effects. Table 17.11 highlights the fact that 
this model relies on learning of both declar- 
ative and procedural knowledge at both the 
tic position enables the model to achieve
a number of other features not attained by 
many other models: 


1. It does not have to rely on artificial as- 
sumptions about presentation frequency. 

2. It does not need corrective feedback on its 
own generations. 


3. It explains why irregular forms tend to be 
high frequency and why high-frequency 
words tend to be irregular. 

4. It correctly predicts that novel words will 
receive regular past tenses. 


5. It predicts the gradual onset of overgen- 
eralization and its much more gradual 
extinction. 


Conclusions and Future Directions 


This chapter describes six production- 
systems models accounting for six different 
areas of cognition: problem-solving choice, 
analogy making, working memory, catego- 
rization, skill learning, and past-tense learn- 
ing. In some cases, an important contri- 
bution of the model lies in specifying a 
production system that implements a fairly 
general reasoning strategy (e.g., analogy
making and categorization). The analogy 
model specifies a path-mapping process as 
a set of production rules. The categorization 
model specifies two processes for categoriza- 
tion — by rules (with exceptions) and by re- 
trieving multiple category exemplars — both 
implemented as sets of production rules that 
cohabit a single production system. In both 
models, it is not only the production rules 
that govern model behavior but also sub- 
symbolic quantities that influence how the 
production rules do their work. In the anal- 
ogy model, subsymbolic activation levels as- 
sociated with different declarative chunks 
influence which parts of the analogy will 
get mapped and when; in the categoriza- 
tion model, subsymbolic utility levels asso- 
ciated with different production rules influ- 
ence which categorization approach will be 
chosen and when. 


of the models is specifying how multiple 
strategic approaches to a given task can be 
integrated. Indeed, a common but often un- 
deremphasized feature of high-level cog- 
nitive tasks is that people can approach 
them in so many ways. The problem-solving 
model addresses this issue of choice directly 
and illustrates a modern interpretation of 
production-rule conflict resolution. Specifically,
this model (along with the categorization,
skill-learning, and past-tense-learning
models) demonstrates that a noisy selection 
of the production rule with highest utility 
(where utility is naturally learned through 
experience by the system) works well to 
choose among different strategies. 

A related contribution made by some 
of these models is making clear that rule- 
like thinking is not always best represented 
in terms of production rules. The categorization,
skill-learning, and past-tense-learning
models all use multiple strategic approaches; 
in the latter two models, one of the ap- 
proaches is based on production-rule repre- 
sentations of knowledge and another is not 
based on production-rule representations of 
knowledge. Together, the two representa- 
tional forms complement each other in a way 
that accounts for the variability in people’s 
behavior. Accounting for variability is prob- 
ably the most notable contribution of the 
working-memory model given that it posits 
a theory of working-memory limitations that 
can be used to estimate individuals’ working- 
memory capacities and then predict other 
task performance on that basis. 

What is most striking about these mod- 
els as a whole, however, is that they make 
use of the same set of mechanisms for learn- 
ing and using knowledge across such a dis- 
parate set of tasks and that they use the same 
two kinds of knowledge representations — 
production rules and declarative chunks. Al- 
though each model emphasizes a somewhat 
different subset of mechanisms (compare 
Tables 17.4-17.7, 17.10, and 17.11), they all 
fit together in a unified architecture, just as 
the many processes of human cognition all 
must fit together in the human brain. Likewise,
modern production systems offer an
understanding of how the many components
of cognition are integrated. 


Production Systems into the Future 


Given the progress represented by the rela- 
tively few models presented here, it is worth- 
while to speculate how production systems 
will continue to be involved in future re- 
search on cognition. Two areas in which pro- 
duction systems have ventured in the past 
few years are already showing initial levels 
of success and promise to play a large role in 
future developments in modeling. 

One of these areas involves the devel- 
opment of production-system models that 
can handle complex tasks. Complexity can 
arise in many dimensions, but one involves 
the dynamic qualities of the task. Air-traffic 
control is a dynamic task in that it requires 
continuous attention to changing stimuli and 
changing task demands. It is complex in that 
it requires the integration of multiple areas 
of knowledge (e.g., different skills to handle 
the different situations) and the integration 
of perceptual, motor, and cognitive processing
(e.g., actually working with a graphical
and keyboard interface akin to what real air- 
traffic controllers use). A modeling compe- 
tition to account for human data of vari- 
ous sorts on an air-traffic-control task called 
AMBR (Agent-Based Modeling and Behav- 
ior Representation) set the bar high with re- 
gard to modeling a complex, dynamic task 
(Gluck & Pew, 2001). Several production- 
system models, including one built within 
ACT-R (Lebiere, Anderson, and Bothell, 
2001) and one built within an EPIC-Soar
combination (Chong and Wray, 2002), took 
on the challenge and demonstrated success 
in accounting for the number and type of 
errors of human controllers and, in the case 


of ACT-R, similar levels of variability in per- 
formance among human controllers. Similar 
successes are beginning to arise in real-world
applications of production-systems models. 
For instance, there is the Soar model that 
managed to fly 50 Air Force training mis- 
sions (Jones et al., 1999), and other exam- 
ples of production-systems models used in 
industry and military applications will likely 
become more the rule than the exception 
(pardon the pun!). Some of these will likely 
come in the form of cognitive agents (e.g., 
Best, Lebiere, and Scarpinatto, 2002) that act 
in virtual worlds (e.g., for training purposes 
with humans) or real environments (e.g., in 
coordination with robotic systems).

Another area of current growth in
production-systems modeling that promises 
to expand is the integration of production- 
systems models (i.e., their components and 
their predictions) with neuroimaging work. 
With the growth of functional neuroimaging 
as a means of studying cognition (see Goel, 
Chap. 20), the field of cognitive modeling 
has another dependent measure for test- 
ing models’ predictions. To take advantage 
of this additional constraint, however, the 
model must posit some mapping between 
model output and neuroimaging results. A 
direct approach is to map brain locations to 
model functions and then predict localiza- 
tion of activation on the basis of model func- 
tions. This basic approach can be elaborated 
to account for the time course of brain activ- 
ity either at a coarse-grain size (e.g., predict- 
ing differential localization of activity early 
versus late in the task or between conditions) 
or at a fine-grain size (e.g., within a single 
trial). Both of these approaches have been 
used in tasks ranging from language process- 
ing (Just, Carpenter, and Varma, 1999) to 
equation solving (Anderson et al., in press),
and task switching (Sohn et al., 2000). The goal of
this work is to improve our understanding 
of brain function and also of the mapping 
to production-system models of thought. 
Whereas production systems have tended to 
describe cognitive processes at a high level of 
abstraction, the trend has been toward more 
and more fine-grained models, so it is now 
becoming appropriate to consider the neu- 
ral processing implications of many of the 
issues in production-system models. 
Notes


1. Interestingly, recent production-system mod- 
els have returned to embrace an individual 
participant approach with quantitative analy- 
ses (e.g., Daily, Lovett, & Reder, 2001; Lovett, 
Daily, & Reder, 2000). 

2. Asynchronous parallelism means that each per- 
ceptual/motor module can work in parallel 
with others in such a way that the actions of 
each need not be synchronized with actions of 
the others. 

3. However, Young and Lewis (1999) have
posited that no more than two items with the 
same type of coding (e.g., phonological or syn- 
tactic) can be stored in dynamic memory at 
a time. 
CHAPTER 18


Implicit Cognition and Thought 


Leib Litman 
Arthur S. Reber 


Introduction 


The debate about the existence of an uncon- 
scious mental life is as old as psychology it- 
self. An overview of the contemporary opin- 
ions about the nature of the unconscious 
shows that, despite countless studies, the 
opinions expressed by the researchers work- 
ing in this field are as diverse today as they 
were when the debates began. The spectrum 
of opinions ranges from a profound conviction
that a significant aspect of mental life is
embodied in the unconscious (Erdelyi, 1985; 
Pyszczynski, Greenberg, & Solomon, 1999; 
Reber, 1993) to opinions that stress that the 
very idea of a complex and abstract uncon- 
scious mental life is virtually a contradiction 
in terms (Perruchet & Vinter, 2003; Shanks
& St. John, 1994). Despite the divergence in 
the range of opinions, findings such as the 
following certainly suggest that the notion 
of unconscious thought has to be taken seri- 
ously, even by the most skeptical. 


• Subjects learn to differentiate seemingly
nonsensical sequences of letters that follow
complex rules from those that violate
the rules despite being unaware of the nature
or even the existence of the rules
(Reber, 1967, 1993).

• Participants display analogic transfer from
one complex problem to another without
awareness of the commonalities in the
two sets of conditions (Schunn & Dunbar,
1996).

• Infants show a similar ability to discriminate
rule-governed patterns of phonetic
and visual elements from those
that violate the rules (Gomez & Gerken,
1999, 2000; Saffran, Aslin, & Newport,
1996).

• fMRI data show that brain areas that
normally process particular stimuli are
activated even when the words are
presented subliminally (Naccache &
Dehaene, 2001).

• Amnesiacs, who have lost the ability to
form conscious memories, display improved
performance on a variety of tasks
over time, suggesting that the past experiences
are unconsciously represented
and are influencing thoughts and behavior
(Glisky & Schacter, 1986).

words that were read to them while under
anesthesia (Kihlstrom et al., 1990).

• Subjects with neurological disorders that
limit their ability to perceive certain areas
of the visual field nevertheless respond to
stimuli that are presented in those areas
(Weiskrantz, 1986).


No one of these findings is in itself con- 
clusive. Arguments have been made that at- 
tack the assumptions and techniques used 
in each case. In a recent exchange, Perruchet 
and Vinter (2003) debate these and other is- 
sues with upwards of three dozen commen- 
tators and critics. 

Here, we present an overview of the ba- 
sic findings that support the existence of 
a sophisticated cognitive system that oper- 
ates largely independent of consciousness, 
discuss the criticisms mounted against these 
findings, and outline the various approaches 
that have been taken in an attempt to 
overcome those critiques. Included in our 
overview are areas such as implicit learning 
and memory, subliminal perception, the role 
of attention, and the impact implicit pro- 
cesses have on problem solving and creativ- 
ity. Rather than providing a thorough review 
of the findings in any one of these areas, our 
focus is on developing a conceptual theme 
that subsumes the diverse field of the studies 
involved in investigating implicit cognition. 


What Implicit Implies 


What, exactly, do we mean by implicit and 
explicit thought? Traditionally the terms “implicit”
and “explicit” have been treated as
synonyms of the terms “unconscious” and “conscious,”
respectively. Implicit knowledge is defined as
being: “unconscious, covert, tacit, hence of 
a process that takes place largely outside of 
the awareness of the individual; the term is 
used in this sense to characterize cognitive 
processes that operate independently of con- 
sciousness” (Reber & Reber, 2001). The use 
of the terms implicit and explicit in this con- 
text refers to states of consciousness. To say 


represented in memory, for example, is to 
say that it can be consciously recalled or rec- 
ognized. When someone explicitly recalls a 
friend’s name, that name, at the time of re- 
call, is consciously represented. 

Implicit thought on the other hand is un- 
conscious, and the content of a memory is 
considered to be implicit when it exerts its 
influence on thought or action even though 
it cannot be recalled. Claparède (1911/1951)
described the classic case of an amnesic patient
whom he pricked with a needle concealed
in his palm. The patient, of course,
forgot what happened almost immediately
after the incident in the sense that she no
longer had explicit memory for the event
or even, for that matter, for Claparède. However,
when he attempted to shake the patient’s
hand some days later she, surprisingly,
refused, exclaiming, “One never knows
what people carry around in their hands.” 
Experience with this patient, as well as hun- 
dreds of others (Cohen & Squire, 1980; 
Schacter, 1987; Scoville & Milner, 1957),
shows that memory for events can influence 
both thought and behavior even when that 
memory is no longer available for conscious 
inspection. 

Describing implicit and explicit thought
merely in terms of subjective states, however,
is hardly sufficient to capture the distinctions
between them. Implicit representations
are, at least in some ways, not only
unconscious but are also thought, at least
by some theorists (e.g., Perruchet & Vinter,
2003), to be of a different form than conscious
representations.

Implicit knowledge may be stored in a dif- 
ferent form from when it appears in con- 
sciousness. Here’s an argument put forward 
by Perruchet and Vinter (2003): imagine a 
computer representation of a pencil. When 
a picture of the pencil is presented on the 
computer screen the entire pencil is repre- 
sented. However, off the screen, there is only 
a bit-wise representation that in no way re- 
sembles its on-screen form. Inside the com- 
puter you will not find a picture of a pencil as 
such. The bits representing the pencil might 
exist in very different parts of the hard drive 
like the picture of the pencil when it is processed
and brought up to the screen. By analogy,
tacit knowledge may not contain a full
representation of objects — this can only hap- 
pen in consciousness — that is, on the screen. 
This position is consistent with the Lockean 
notion that true mental representations are 
only possible in consciousness. 

Most computational models, in one way 
or another, have endorsed the perspective 
that unconscious representations are not 
identical to conscious ones. In Anderson’s 
ACT-R model, implicit memory repre- 
sents subsymbolic information that oper- 
ates by controlling access to explicit declara- 
tive knowledge (Anderson & Lebiere, 2003; 
Lovett & Anderson, Chap. 17). Various
other models that are built on connectionist
architectures (Dienes, 1992; Cleeremans,
1993; Servan-Schreiber, Cleeremans,
& McClelland, 1991) share the notion that 
implicit knowledge is nonrule based and 
nonsymbolic. 

The one thing that is clear here is that 
there is no consensus as to whether implicit 
knowledge can be symbolic. Our position is 
that it can. We will have more to say about 
the nature of the representations of im- 
plicit knowledge later. For now, keep in mind 
these two nuanced entailments of implicit: 
the conscious quality of knowledge and the 
extent to which the knowledge is symboli- 
cally represented. Although often used inter- 
changeably, the two senses do not perfectly 
overlap. When processes such as rule use 
and symbol manipulation are discussed they 
are typically assumed to be conscious, top- 
down operations. However, there is no a pri- 
ori reason to conclude that the unconscious 
cannot manipulate symbols and use prede- 
fined rules. We favor the notion that im- 
plicit thought can be based on abstract rep- 
resentations and that such knowledge is not 
only possible but is responsible for much of 
the complexity and adaptiveness of human 
behavior (Lovett & Anderson, Chap. 17). 
However, much evidence has been pre- 
sented both for and against this view, and 
in what follows we provide an overview of 
this research. 


How can we know exactly when a mental 
process is unconscious? In examining this 
question it is useful to make a distinction be- 
tween two stages of information processing: 
encoding and retrieval. The importance of 
this distinction is that a suitable method- 
ology for demonstrating the implicitness at 
one point is not applicable at another. Con- 
sider research with amnesiacs in which the 
evidence for tacitly held knowledge is found 
at the retrieval stage (i.e., a patient performs
better over time but does not remember
the training session) but not at encoding
(i.e., amnesiac patients are consciously
aware of what they are learning while they 
are learning it). Demonstrating that mem- 
ory is not used consciously at the time of re- 
trieval entails a different methodology than 
that used to demonstrate that the encod- 
ing process was unconscious. For encoding, 
it is necessary to demonstrate that at the 
moment of the presentation of the stimu- 
lus, the subject’s awareness of that stimulus 
was deficient. 

It turns out that demonstrating a lack of 
awareness at encoding is deeply problem- 
atic. How do we know that the stimulus 
was really not consciously perceived in some, 
perhaps minimal, way? Because the deci- 
sion as to whether the subject consciously 
perceived the stimulus relies on one or an- 
other form of subjective self-report, we are 
stuck with having to rely on nonverifiable 
measurement. 


Accessibility and Availability 


Imagine a typical subliminal perception ex- 
periment in which subjects are given brief or 
masked exposure to objects or words. Later, 
knowledge of the target is assessed either by 
direct tests (what word did you see?) or by 
indirect tests (changes in response times over 
time). Note that there are actually two con- 
structs here: the accessibility of the informa- 
tion at encoding and the availability of the 
stored knowledge sometime after presen- 
tation. Some (Brody, 1989; Eriksen, 1959) 
maintain that unconscious perception can 
ity of the stimulus at the time of encoding is
zero. In other words, there should be no dif- 
ference in accessibility of a stimulus between 
the subject and a blind person. This criterion 
is very difficult (if not impossible) to achieve 
because it is always possible that something 
was perceived consciously and chance per- 
formance is attributable to subjects’ not be- 
ing sufficiently confident to make a response 
based on what little they did see. 

Alternatively, as suggested by Erdelyi 
(1986, 2004), we can require that the avail- 
ability of the stimulus to consciousness be 
greater than the extent to which that stim- 
ulus was consciously accessible at the time 
of encoding. This approach has an impor-
tant advantage in that it reveals a critical but
often unrecognized fact: the impact of un- 
conscious knowledge on behavior is a con- 
tinuum and not an either/or issue. 


The Implicit/Explicit Continuum 


It is erroneous to say that a behavior has to 
be explained in its entirety by either con- 
scious or unconscious input. Most cognitive 
tasks, including perception, memory, prob- 
lem solving, and creativity are products of 
the influences of both conscious and uncon- 
scious processes. The existence of conscious 
factors does not in any way preclude the 
further influence of unconscious ones. The 
findings of two important experiments help 
make this point. 

In the first, Mathews et al. (1989) had 
experimental subjects engage in an implicit 
learning task over a four-day period. The 
study used what is known as an artificial 
grammar (AG). An example of a typical AG 
is given in Figure 18.1 along with several let- 
ter strings that it can generate and a num- 
ber of nongrammatical or not well-formed 
strings that contain a single letter violation. 
It is apparent that the system is complex and,
as Mathews et al. found, not easy to de-
scribe. In the canonical AG learning study,
subjects memorize a number (perhaps 15 
or 20) of exemplary letter strings and then, 
using what knowledge they acquired from 
the learning phase, attempt to distinguish
whether new strings are well formed
or not — that is, whether or not they conform 
to the rules that generated the original set. 
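
To make the idea of a rule-governed string set concrete, the following
Python sketch encodes one finite-state grammar and checks strings against
it. The transition table is an assumption: it is one grammar that happens
to accept the well-formed examples and reject the violation examples
listed with Figure 18.1, not a transcription of the figure itself. The
state names (S1 to S5, END) and the function names are illustrative only.

```python
import random

# Hypothetical transition table: state -> list of (letter, next_state).
# It is ONE grammar consistent with the example strings listed with
# Figure 18.1; the figure's actual diagram is not reproduced here.
GRAMMAR = {
    "S1": [("T", "S2"), ("P", "S3")],
    "S2": [("S", "S2"), ("X", "S4")],
    "S3": [("T", "S3"), ("V", "S5")],
    "S4": [("X", "S3"), ("S", "END")],
    "S5": [("P", "S4"), ("V", "END")],
}
START, END = "S1", "END"


def is_well_formed(string):
    """Return True if the grammar can produce `string` exactly."""
    state = START
    for letter in string:
        if state == END:
            return False  # letters left over after the terminal state
        moves = dict(GRAMMAR[state])  # each letter is unique within a state
        if letter not in moves:
            return False
        state = moves[letter]
    return state == END


def generate(rng, max_len=9):
    """Random walk from START to END; retry if the string gets too long."""
    while True:
        state, letters = START, []
        while state != END and len(letters) <= max_len:
            letter, state = rng.choice(GRAMMAR[state])
            letters.append(letter)
        if state == END and len(letters) <= max_len:
            return "".join(letters)


well_formed = ["PVPXVPS", "TSSXXVPS", "TSXS", "PTVPXVV", "PTTTVV", "PVPXVPXVV"]
violations = ["PTTVVPS", "TXXTXPS", "TSSXV", "VTTVV", "TSSTVV", "PTTTVPV"]
print(all(is_well_formed(s) for s in well_formed))  # True
print(any(is_well_formed(s) for s in violations))   # False
print(generate(random.Random(0)))                   # one new grammatical string
```

A learner in the canonical study never sees such a table; the point of
the sketch is only that the distinction subjects are asked to make is
fully determined by a small set of sequential rules.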

The clever twist that Mathews and his 
colleagues used was to stop their partici- 
pants from time to time during the “well- 
formedness” phase and ask them to expli- 
cate, in as much detail as possible, what they 
knew and how they were making their de- 
cisions. Transcripts of their responses were 
then given to four different yoked-control 
groups who, without having any learning ex- 
perience, were asked to classify the same 
strings. If subjects were consciously aware of 
the knowledge they had acquired and could 
communicate it, the yoked subjects should 
perform at the same level as the experimen- 
tal participants. 

Mathews et al. found that their exper- 
imental subjects could make reliable de- 
cisions on the very first day of the study 
but were remarkably inept in communicat- 
ing what they had learned — yoked sub- 
jects working with the Day 1 transcripts 
performed at chance. However, as the ex- 
periment progressed, the experimental sub- 
jects’ ability to verbalize their knowledge 
improved dramatically. In fact, the yoked 
subjects who received the Day 4 transcripts 
made decisions nearly as well as the experi- 
mental participants. Interestingly, the exper- 
imental subjects’ performance on the pri- 
mary task didn’t improve significantly after 
the second day although their ability to ex- 
plicate what they knew did. This is an ex- 
ample of how knowledge is encoded implic- 
itly but over time becomes explicit and can 
then be retrieved consciously. Another im-
plication of the Mathews team’s study is that the
implicit and the explicit are bound up in a 
delicate synergy and we would be wise to 
refrain from all-or-none distinctions. 

In the second study, Knowlton and Squire 
(1994) showed that even when conscious 
knowledge is fully available (i.e., in Erdelyi’s
[1986] terms, accessibility equals availabil- 
ity) it doesn’t mean that it is necessar- 
ily being utilized at all times. They found 
that amnesic patients’ performance was in- 
distinguishable from normal controls on a 
standard AG learning task, as described 

Well-Formed Strings

PVPXVPS
TSSXXVPS
TSXS
PTVPXVV
PTTTVV
PVPXVPXVV

Strings with a single-letter violation

PTTVVPS
TXXTXPS
TSSXV
VTTVV
TSSTVV
PTTTVPV


Figure 18.1. A typical artificial grammar used in many studies of implicit 
learning. The grammar generates letter strings by following the arrows from the 
input state (S,) to the terminal state (S,). Several examples of “well-formed” 
strings are presented along with others that contain a violation of the grammar. 


previously, suggesting that representations 
of knowledge need not be held in a con- 
scious form to be used to make decisions. 
However, when both groups were encour-
aged to make decisions by utilizing any sim-
ilarities between the test stimuli and those 
used during learning, the two groups dif- 
fered. The normal controls showed a small 
but significant improvement, whereas the 
patient group’s performance, perhaps not 
surprisingly, actually diminished. 

There are two implications of these stud- 
ies. First, in these tasks, normal subjects pos- 
sess a delicate balance of implicit and ex-
plicit knowledge and how each is manifested
depends as much on the task demands as 
on the accessibility of conscious knowledge. 
Second, amnesic patients, whose neurolog- 
ical injuries have compromised their abil- 
ity to form consciously accessible long-term 
memories, can still carry out complex im- 
plicit learning tasks. The balanced synergy 
is largely missing in this population, but 
the implicit system appears to be relatively 
intact. 

In short, it is both theoretically sounder 
and methodologically more plausible to look 
at the impact of implicit knowledge as 
falling along a continuum. Rather than
ask whether or not a particular task was im-
plicitly or explicitly performed, one should
examine the extent to which both implicit 
and explicit factors are playing a role in the 
behavior in question. With this framework 
in mind, let’s explore the several domains in 
which unconscious mechanisms have been 
examined. 


Consciousness at Encoding 


DIVERTED ATTENTION 


One interesting aspect of attention, at least 
for the purposes of implicit processes, is that 
attention and consciousness are highly corre- 
lated. When something is being attended to, 
for example, the words of this sentence, the 
object in the focus of attention becomes con- 
scious. Of course, because of the limits on 
attention, some, if not most, of the sensorial 
events in the outside world are not within 
the focus of our attention. When reading a 
book, we “tune out” much of the outside 
world such as conversations and traffic. In-
deed, we ignore most of the events that are
outside of the attentional focus. The ques- 
tion that almost asks itself is, What effect do 
the unattended, nonconscious events in our 
environment have on us? Do they get regis- 
tered unconsciously in some fashion with- 
out our awareness, or are the effects of 
unattended events trivial and only become 
important when and if they are consciously 
attended to? 

The effects of diverting attention from 
the stimulus at encoding are usually stud- 
ied in the context of a dual-task paradigm in 
which attention is diverted by a secondary 
stimulus (Morrison, Chap. 19). For example, 
in the classic dichotic-listening task two dif- 
ferent messages are played, one to each ear. 
One message is attended to, the other not — 
although the secondary message often con- 
tains important information. Afterwards, a 
simple memory or priming task is used to 
discover the effects of diverted attention. 

The initial findings here suggested that, 
when attention is diverted from a stimu- 
lus, the effect of that stimulus is greatly 
reduced (Broadbent, 1958; Cherry, 1953;
Moray, 1959). For example, Johnson and
Wilson (1980) presented ambiguous words 
like “sock” to one ear while a disambiguat- 
ing word (“foot” or “punch”) was presented 
in the other. They found that the amount 
of attention allocated to the encoded stimu- 
lus was critical. When the instructions were 
to attend to both channels, the word “foot” 
facilitated the interpretation of the ambigu- 
ous homophone “sock.” But when attention 
was directed to the channel in which the tar- 
get words were presented, items in the unat- 
tended channel did not influence the per- 
ceived meaning of the targets. 

Later studies, however, questioned this 
conclusion. Eich (1984) presented subjects 
with homophones such as “fare/fair” in one 
ear and a modifier of the less frequent mean- 
ing (e.g., “taxi”) in the other. Subjects were 
then given a recognition test for the mod- 
ifiers and were asked to spell the target 
words (“fare” or “fair”). Eich found a clear 
impact of implicit memory. Despite being 
virtually at chance on the recognition task, 
subjects showed a strong tendency to spell 
the test words according to their primed, 
but less common, meaning. Similar find- 
ings were reported in a series of studies by 
Mulligan (1997, 1998) in which subjects 
memorized word lists while their attention 
was diverted by a secondary task involving 
repeating strings of digits of varying lengths. 
Attentional load was manipulated by vary- 
ing the length of the digit string from three 
to seven. Mulligan found that increasing 
the attentional load impaired explicit mem- 
ory performance for the original words (us- 
ing cued recall) but had essentially no ef- 
fect on implicit memory (as measured by 
stem completion). It seems, therefore, that
stimuli presented at encoding have an impact
on subsequent performance even when they
are not consciously perceived.


UNDER ANESTHESIA 


Although diverting attention certainly re- 
duces the likelihood of the stimuli’s be- 
ing consciously encoded, Kihlstrom et al. 
(1990) made quite certain that their input 
stimuli were not being attended to. Their 
subjects were surgical patients who were
presented with a repeating list of stimulus
items while completely anesthetized. Af- 
ter surgery, although patients had no ex- 
plicit memory for the material, an implicit, 
free-association test showed that informa- 
tion presented during anesthesia was en- 
coded. Shanks and St. John (1994) criti-
cized these and other studies on the grounds
that they produced mixed findings and of- 
ten used questionable methodology. How- 
ever, recently, Merikle and Daneman (1996) 
conducted a meta-analysis of 44 studies with 
several thousand participants and concluded 
that, taken as a whole, these studies support 
the argument that items presented during 
anesthesia can have an impact on postsur- 
gical tests. Questions still remain, however,
with regard to the possibility that at least 
some of the subjects were partially conscious 
during stimulus presentation. 


SUBLIMINAL PERCEPTION 


Historically, subliminal perception studies 
have been controversial. They have ranged 
from the embarrassing “eat popcorn — drink 
coke” hoax that was foisted on the pub- 
lic a half-century ago by an overzealous 
advertising agent (Pratkanis, 1992) to the 
vigorously debated use of subliminal mes- 
sages in psychotherapeutic settings (Silver- 
man, 1983; Weinberger, 1992). Admittedly, 
much of the early work was suspect and has 
been vigorously criticized (Eriksen, 1959; 
Holender, 1986; Shanks & St. John, 1994), 
and, as we noted earlier, there are numer-
ous methodological traps that await the un-
wary. However, the impact of subliminally 
presented material on subsequent behavior 
has now been replicated in literally hundreds 
of experiments and the evidence appears 
to be convincing. Subliminal presentation 
can have an effect on emotional preferences 
(Kunst-Wilson & Zajonc, 1980; Murphy & 
Zajonc, 1993) and produce semantic prim- 
ing (Draine & Greenwald, 1998). Moreover, 
similarly undetectable stimuli have been 
shown to activate appropriate brain regions — 
emotionally charged stimuli activate the 
amygdala (Elliot & Dolan, 1998; Whalen
et al., 1998) and numerical presenta-
tions produce parietal activity (Naccache &
Dehaene, 2001). 

The studies with which we are most 
comfortable are those that follow Erdelyi’s 
(1986, 2004) advice cited previously: en-
sure that there are separate and reliable mea-
sures of accessible and available knowledge — 
with the critical inequality being those 
situations in which knowledge that is “ac- 
cessible” by consciousness is less than knowl- 
edge that is “available” and can be shown to 
have some (indirect or implicit) impact on 
behavior. (For more on the issues of measur- 
ing unconscious knowledge, see Merikle and 
Reingold, 1992.) Two classic studies that ap- 
pear to satisfy this condition (Marcel, 1983 
and Kunst-Wilson & Zajonc, 1980) are worth 
a closer look. 

In an extended series of experiments, 
Marcel (1983) showed that graphic (what 
the word looks like) and semantic (what 
the word means) information of subliminally 
presented words can affect choice behavior. 
One of Marcel’s standard protocols involved 
presenting subjects with two words, one sub- 
liminally and the other supraliminally. Af- 
ter each presentation, subjects were asked 
whether or not they saw a word. After the 
pair was presented, they were asked whether 
the two words were physically and seman- 
tically related. By systematically varying the 
sub/supra-liminality of the stimuli, Marcel 
was able to explore the manner in which 
implicitly and explicitly encoded stimuli af- 
fected subjects’ choices. The key finding was 
a threshold effect whereby subjects were at 
chance in determining the presence or ab- 
sence of the subliminally presented word but 
could reliably report whether the two words 
were semantically and graphically similar. 
Marcel’s conclusion, based on the full series 
of experiments, was that although there is a 
gradual effect of awareness on performance, 
importantly, a complete lack of awareness 
does not entirely remove that effect. 

In Kunst-Wilson and Zajonc’s (1980) clas- 
sic study, subjects were subliminally pre- 
sented with a set of irregular octagons. They 
were then shown pairs of octagons supral- 
iminally, one of which was from the set 
shown previously and the other of which was
new. Subjects were asked both to select
the item they thought was presented before 
and to pick the one they preferred. Kunst- 
Wilson and Zajonc found that, despite being 
at chance on the recognition task, subjects 
showed a preference for the subliminally 
presented octagons over the novel ones, 
demonstrating that affective preferences can 
be influenced by events that were not con- 
sciously noticed (Kunst-Wilson & Zajonc,
1980; Murphy & Zajonc, 1993).
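
The logic of this kind of dissociation (recognition at chance,
preference above chance) can be sketched as two tests against a 50%
guessing rate. The Python example below is a minimal illustration, not
the analysis used in the original study: the counts are invented, the
function name is hypothetical, and an exact two-sided binomial test is
simply one reasonable choice for forced-choice data.

```python
from math import comb


def binomial_two_sided_p(successes, trials, chance=0.5):
    """Exact two-sided binomial p-value against a chance rate."""

    def pmf(k):
        return comb(trials, k) * chance**k * (1 - chance) ** (trials - k)

    observed = pmf(successes)
    # Add up every outcome that is no more probable than the observed one.
    total = sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


# Made-up counts for illustration only (NOT the published data):
# 200 forced-choice trials for each kind of judgment.
trials = 200
recognition_correct = 98  # hypothetical: close to the 100 expected by chance
preferred_old = 122       # hypothetical: old item chosen more often than not

p_recognition = binomial_two_sided_p(recognition_correct, trials)
p_preference = binomial_two_sided_p(preferred_old, trials)

# The pattern of interest: the direct test is indistinguishable from
# guessing while the indirect test is not.
print(p_recognition > 0.05, p_preference < 0.05)
```

In Erdelyi's terms, the direct (recognition) measure estimates what is
accessible and the indirect (preference) measure estimates what is
available; the dissociation is only as convincing as the reliability of
both measures.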

Elliot and Dolan (1998) extended this 
“subliminal mere exposure” effect and 
showed that, in addition to preferring the 
previously presented items, different brain 
regions were activated when old and novel 
stimuli were later presented supraliminally. 
This finding is consistent with a large num-
ber of fMRI studies that suggest that implicit
and explicit memory retrieval involves the
activation of distinct brain regions (for a re-
view see Cabeza & Nyberg, 2000). Whalen
et al. (1998) have also demonstrated that 
subliminally presented faces displaying fear- 
ful emotions activated the amygdala despite 
a lack of subjective awareness of ever hav- 
ing seen those faces. Happy faces presented 
for identical time periods had no effect on 
these structures. 

Finally, Naccache and Dehaene (2001) 
presented evidence of abstract representa- 
tion in subliminal priming. Subjects were 
asked to decide whether a target number 
was bigger or smaller than 5. Each target 
was preceded by a subliminal prime that was 
either spelled out (six) or presented in nu- 
meric form (6). Subjects displayed faster re- 
action times when the prime and target were 
the same number — regardless of the num- 
ber’s form. In addition, fMRI data revealed
that the subliminal primes elicited the same 
parietal lobe activity as the supraliminal tar- 
gets, suggesting similar cortical processing. 

All of the studies above are open to the 
critique that some awareness of the stim- 
uli might have contaminated the procedure 
(Holender, 1986; Shanks & St. John, 1994). 
To counter this criticism, Debner and Jacoby 
(1994) used the process-dissociation proce- 
dure (Jacoby, 1991). In their study, words 
(e.g., MOTEL) were first presented sublimi-
nally. Subjects were then asked to
complete word stems such as MOT with the
restriction that they not use any word they 
thought might have been used in the sub- 
liminal presentation phase. The logic here 
is clever. If the word was consciously per- 
ceived, the subjects should have been able to 
refrain from using that word to complete the 
word stem. However, if they did not see the
word, then the subliminally presented word 
should have been used as often as the others. 
The results showed that subjects were typi- 
cally not able to follow this instruction and 
tended to use the subliminal primes. 
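
For readers who want to see how such data are usually turned into
estimates, here is a minimal Python sketch of the arithmetic commonly
associated with the process-dissociation procedure. It goes beyond the
passage above in two labeled ways: it assumes a matching inclusion
condition (subjects told to use the earlier word), which is not
described here, and it assumes that conscious and unconscious influences
act independently. The function name and the rates are invented for
illustration.

```python
def process_dissociation(inclusion_rate, exclusion_rate):
    """Estimate conscious (C) and unconscious (U) influences.

    Assumed independence equations (not stated in the passage above):
        inclusion = C + U * (1 - C)
        exclusion = U * (1 - C)
    so that C = inclusion - exclusion and U = exclusion / (1 - C).
    """
    for rate in (inclusion_rate, exclusion_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must lie between 0 and 1")
    conscious = inclusion_rate - exclusion_rate
    if conscious >= 1.0:
        raise ValueError("U is undefined when C equals 1")
    unconscious = exclusion_rate / (1.0 - conscious)
    return conscious, unconscious


# Hypothetical completion rates, NOT data from the study:
# the primed word completes the stem on 60% of inclusion trials
# and on 40% of exclusion trials.
c, u = process_dissociation(0.60, 0.40)
print(round(c, 2), round(u, 2))  # 0.2 0.5
```

The exclusion condition carries the argument in the text: any use of
the primed word there counts against conscious control, which is why
above-baseline completion under exclusion is read as an unconscious
influence.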

In summary, the suggestion that atten-
tion and consciousness are needed for the en-
coding of complex, semantically sensitive 
events (Perruchet & Vinter, 2003; Shanks & 
St. John, 1994) is probably unwarranted. 
These studies, although not uniform in their 
conclusions, suggest that fairly sophisticated 
information about complex stimulus dis- 
plays can be picked up under severe atten- 
tional load, when the material is presented 
subliminally and, possibly, under anesthe- 
sia. They also support the notion that this 
information is not simply logged in some 
inert form but has an impact on memo- 
rial representations, choice behavior, and 
decision making. 


Memory 


Virtually every complex living organism has 
the ability to store the products of expe- 
rience to be accessed at some later time. 
People’s ability to store a seemingly endless 
array of episodes, facts, motor skills, and lin- 
guistic and social knowledge and to retrieve 
the appropriate information rapidly and ap-
propriately is a remarkable phenomenon,
and one that still remains something of a
mystery. Cognitive investigations of memory 
revealed early on that human memory is not 
a single, unified phenomenon. There are dif- 
ferent kinds of memories and each is instan- 
tiated in a variety of ways. Our concern here 
is the extent to which memory processes 
are modulated by conscious intentions or are 
implicit and operate outside the spotlight of 
consciousness. 

Memory has tradi-
tionally been studied using direct tests in
which participants are asked to consciously 
recall or recognize previously memorized 
items. The original assumption was that a 
failure to recall or an inability to recognize 
an item is diagnostic of that item having been 
forgotten. However, as we noted earlier, just 
because people cannot recall something does 
not necessarily mean the memory no longer 
exists. In some ways, it is surprising that it 
took cognitive psychologists so long to ap- 
preciate this aspect of human memory. Early 
reports by neurologists such as Claparède
and Korsakoff implicated implicit represen- 
tational systems and, lest we forget, Freudian 
psychoanalysis was founded on the existence 
of nonretrievable memories that play a role 
in human behavior (Erdelyi, 1985). 

The renewed interest in implicit mem- 
ory was largely attributable to the discov- 
ery that amnesiac patients, despite being 
compromised in their ability to form new 
explicit knowledge, can nevertheless ac- 
quire new information implicitly. The laying 
down of consciously retrievable, long-term 
memories has been compellingly shown to 
be dependent on structures in the medial 
temporal lobes (MTL), specifically the hip- 
pocampus (Squire, 1995). When the hip- 
pocampus and its associated areas are dam- 
aged or destroyed, it becomes difficult and,
in extreme cases, impossible for new ex-
plicit memories to be formed. The dis- 
covery of the critical role the MTL struc- 
tures play here was made in the case of 
HM, the first neurological patient to have 
his hippocampus surgically removed (see 
Corkin, 1968; Milner, 1962; Milner, Corkin, 
& Teuber, 1968; Squire, 1992; Warrington 
& Weiskrantz, 1968). HM suffered from se- 
vere, intractable epilepsy, the neural focal 
point of which was in the MTL. To alleviate
his multiple daily seizures, surgeons extir-
pated bilaterally the affected brain regions.
Although the surgery was successful in stop- 
ping the seizures, HM emerged from the 
procedure with profound, chronic antero- 
grade amnesia. 

The standard interpretation of HM, 
based on the now rather large number 
of patients with similar neurological dam-
age, is that such people do not suffer from a
learning deficit, per se, but rather from 
an inability to consolidate new explicit, 
or declarative, knowledge. Patients with 
MTL damage show no diminished abil- 
ity to recall episodes that occurred prior 
to the trauma, they present a nearly nor- 
mal short-term memory profile, and, impor- 
tantly from our perspective, they show rel- 
atively intact implicit learning and memory. 
Indeed, a large literature has accumulated in 
recent years showing that the performance 
of anterograde amnesiacs is virtually indis- 
tinguishable from that of normals on a wide 
variety of memory tasks including word- 
stem completion (Warrington & Weiskrantz, 
1968; 1974), fragment completion (Tulving, 
Hayman, & MacDonald, 1991), context sen- 
sitive memory (Schacter & Graf, 1986), 
memory for letter strings generated by an 
AG (Knowlton, Ramus, & Squire, 1992; 
Knowlton & Squire, 1994), and recall of 
dot patterns (Knowlton & Squire, 1994). 
As Seger (1994) argued, amnesiac patients 
provide the best empirical support for the 
proposition that knowledge that is not con- 
sciously accessible can still have a profound 
influence on ongoing behavior. 

These discoveries gave rise to a num- 
ber of significant advances in our under- 
standing of memory, both implicit and ex- 
plicit. Reber (1992a,b), Schacter (1987) and 
Squire (1992) all argued that the human 
memorial system can be fruitfully viewed as 
though there were two distinct information- 
processing systems — one declarative or 
explicit, the other procedural or implicit. 
The explicit system was theorized to in- 
clude declarative, conscious knowledge of 
episodes, facts, and events, whereas the im- 
plicit system was assumed to be operating 
largely outside of consciousness and to in- 
clude implicit learning and memory, condi- 
tioning, and learning various skills and habits 
(sensorimotor learning). Although this dis- 
tinction is probably a useful one in that it 
draws attention to the ways in which implicit 
and explicit functions can be dissociated, 
it is probably not the best stance to take 
from a functionalist point of view. As Reber 
(1993) argued, we need to be wary of falling 
into the trap of treating two
distinguishable systems that lie at the poles
of a continuum as though they were onto- 
logically separate and distinct. It is almost 
certainly the case that virtually everything
interesting that human beings do entails a del-
icate synergy between the implicit and the
explicit, the conscious and declarative, and 
the unconscious and procedural. If the im- 
plicit and the explicit systems ultimately are 
shown to be based on neuroanatomically dis- 
tinct structures (as we suspect will be done), 
it will still be virtually impossible to find 
functionally pure instantiations of them. 

In addition, Reber (1992a,b) argued that 
because human consciousness and its ac- 
companying functions are late arrivals on the 
evolutionary scene, there should be partic- 
ular patterns of dissociation between these 
two systems. The key predictions of the 
model for this discussion are: 


(a) Storage and retrieval systems that serve 
the implicit system should be more ro- 
bust and relatively undisturbed by in- 
sult and injury that compromise explicit 
functions. 


(b) There should be relatively little in the 
way of developmental and life-span 
changes in implicit compared with ex- 
plicit functions. This two-system model 
has garnered significant support over the 
past decade (see Reber, Allen, & Reber, 
1999 and Squire & Knowlton, 2000 for 
reviews). 


Taken together, this literature paints a 
clear picture. Human memory has distinct 
systems with distinct evolutionary histories 
and separate, although only partly under- 
stood, neurological underpinnings that map, 
on one hand, into conscious, subjective ex- 
perience and, on the other, into a nexus of 
encoding, storage, and retrieval systems that 
function largely independently of awareness 
and consciousness. However, this picture is 
still incomplete, and appreciating the man- 
ner in which it operates in complex human 
thinking requires a deeper look at the topic 
of learning — specifically implicit learning in 
which knowledge about the complexities of 
the environment is acquired without benefit 
of consciously controlled processes. 


Implicit learning is the process whereby or- 
ganisms acquire knowledge about the reg- 
ularities of complex environments without 
intending to do so and largely independently 
of conscious awareness of the nature of what 
was learned (Stadler & Frensch, 1998; Reber, 
1967; Reber, 1993). The complex environ- 
ments include virtually every facet of human 
life, including language learning, trait knowl- 
edge, categorization, acculturation, and the 
development of aesthetic preferences. The 
claim we are making is that people extract 
information about the world more often 
than they are aware and that this knowledge 
exists in a tacit form, influencing thought 
and behavior while itself remaining mostly 
concealed from conscious awareness. 


IMPLICIT LEARNING IN INFANTS 


By the second month of life, infants can al- 
ready distinguish between utterances spo- 
ken in their native language and those spo- 
ken in foreign languages. Infants can do this 
although they don’t understand what the 
sentences mean in either language. Interest- 
ingly, this effect disappears when the sen- 
tences are presented backwards (Dehaene-
Lambertz & Houston, 1998; Mehler et al.,
1988; Ramus et al., 1999). The implica- 
tions here are that, despite not understand- 
ing the sentences backwards or forwards, 
the infants have become attuned to the 
natural flow of language. This natural flow 
is violated when the sentence is reversed. 
This sensitivity to the structure of linguis-
tic sounds, which seems to be the first stage
of language acquisition, takes place implic-
itly and recruits brain regions similar to 
those of adults as shown by fMRI stud- 
ies (Dehaene-Lambertz, Dehaene, & Hertz- 
Pannier, 2002). 

Within a surprisingly short time, infants 
extract the phonetic regularities of their 
linguistic surroundings and can differenti- 
ate between sound sequences that are well 
formed and those that are not. The back- 
wards sentences sound ill-formed to the in- 
fants because their sequential structure is 
discoordinate with the infant’s experience 
and therefore seem as ill-formed as sentences 
this to the standard implicit learning proce-
dure in Artificial Grammar studies discussed 
below. 

These kinds of effects are not restricted 
to natural languages. Rovee-Collier and her 
colleagues (see Rovee-Collier, 1990 for a re- 
view) report that infants rapidly pick up the 
relationships between their own motor ac- 
tions and the impact that they have on the 
external world. Haith, Wentworth, and Can- 
field (1993) showed that babies make antic- 
ipatory eye movements to regularities in the 
spatial patterns of visual displays. Saffran and 
her colleagues reported that infants as young 
as 8 months show a similar sensitivity to the 
arbitrary statistical nature of auditory pat- 
terns and can learn the rules governing arti- 
ficial word segmentation (Saffran, Aslin,
& Newport, 1996). Interestingly, in Saffran’s 
studies, the infants performed as well as a 
group of adults, a result that supports Re- 
ber’s (1992b) prediction that implicit learn- 
ing systems are present at a very early age and 
undergo little developmental change. Simi- 
larly, Gomez and Gerken (1999, 2000), us-
ing the AG learning procedure, showed that
not only do one-year-olds learn the struc- 
tural characteristics of these rather complex 
systems, they also transfer this knowledge to 
novel stimulus domains. 
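
What "sensitivity to the statistical nature of auditory patterns" could
amount to computationally can be sketched in a few lines of Python. The
example assumes that the relevant statistic is the transitional
probability between adjacent syllables, which the passage above does not
state, and it uses three invented nonsense words and an arbitrary
threshold; none of these details are taken from the studies cited.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """P(next | current) for each adjacent syllable pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold=0.75):
    """Place a word boundary wherever the transitional probability dips."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tp[(prev, nxt)] < threshold:
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words


# Invented nonsense words (hypothetical stimuli), concatenated without
# pauses so that the only cue to word boundaries is statistical.
A, B, C = "tokibu", "gapela", "midoso"
order = [A, B, C, A, C, B, A, B, C, B]
stream = [word[i:i + 2] for word in order for i in (0, 2, 4)]

print(segment(stream) == order)  # True
```

Within a word each syllable predicts the next perfectly, whereas across
word boundaries the next syllable varies, so the dips in predictability
mark the boundaries. Whether infants compute anything like this
explicitly is, of course, exactly what is at issue.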

To date, this research has been restricted 
largely to sensorimotor, perceptual, and cog- 
nitive tasks. Surprisingly, little empirical 
work has been carried out on behaviors that 
are more reflective of social learning. How- 
ever, given the existing database, we suspect 
that when processes of socialization are ex- 
amined from this perspective, they will re- 
veal a parallel set of operations in which in- 
fants gradually become inculcated with the 
social mores and ethical codes of their cul- 
ture without conscious awareness of what 
has been learned and with little in the way 
of top-down control of the process. 


IMPLICIT LEARNING IN ADULTS 


In recent years, a rather impressive array of 
specific tasks have been discovered to have 
dissociative elements in that either direct 
and indirect tests distinguish between im- 
plicit and explicit processes or var-
ious patient populations manifest distinct
patterns of loss of explicit acquisitional func- 
tions while maintaining those based on the 
implicit processes. Included here are stud- 
ies on motor learning (P.J. Reber & Squire, 
1998), AG learning (Knowlton & Squire, 
1994, 1996; Reber, 1967, 1989), category 
learning (Knowlton & Squire, 1993; Squire 
& Knowlton, 1996), Pavlovian conditioning 
(Daum & Ackerman, 1994; Gabrieli et al., 
1995), decision making in social settings 
(Lewicki, 1986a; any of several contribu- 
tions to Uleman & Bargh, 1989), the sequen- 
tial reaction time task (see Hsiao & Reber, 
1998 for a review), the hidden covariation 
task (Lewicki, 1986b), preference formation 
(Gordon & Holyoak, 1983; Manza, Zizak, 
& Reber, 1998), the production control task 
(Berry & Broadbent, 1988), and dot pattern 
classification (P.J. Reber, Stark, & Squire, 
1998). The various chapters in Stadler and 
Frensch’s (1998) edited volume Handbook 
of Implicit Learning are a good resource for a 
more detailed discussion. 

These many reports are supplemented 
by additional findings that show that pa- 
tients with damage to primary visual cor-
tex learn to respond to objects in their blind
fields (Weiskrantz, 1986), prosopagnosiacs 
who cannot consciously recognize the faces 
of family members show virtually normal 
implicit facial memory (De Haan, Young, & 
Newcombe, 1991), patients with neglect re- 
spond to the meaning of stimuli that they are 
unaware of processing (Berti & Rizzolatti, 
1992), amnesic patients show improvement 
in solving problems (Winter et al., 2001) 
and learn to operate complex equipment 
(Glisky, Schacter, & Tulving, 1986) despite 
no conscious memory of the earlier training 
phases of the studies. Issues of the mech- 
anisms underlying disordered thought are 
pursued in detail elsewhere in this volume 
by Bachman and Cannon (see Chap. 21). 

The model that has emerged from this lit- 
erature characterizes implicit learning as a 
mechanism the primary function of which 
is to monitor the environment for reli- 
able relationships between events and to 
encode those patterns of covariation. In 
all likelihood, the underlying neurological 
mechanisms are closely
linked to the modality of input of the stimu-
lus display (Ungerleider, 1995). The under-
lying representations that are established are
probably not as flexible or abstract as those 
that are under conscious control simply be- 
cause the top-down modulation that comes 
with consciousness allows for deliberative 
shifts in representation and use of knowl- 
edge. However, this issue is a highly con- 
tentious one and we have more to say on 
it subsequently. 

In addition, implicit acquisitional mech- 
anisms appear early in life, well before con- 
scious awareness has developed. They show 
relatively little change over the life span 
compared with explicit cognitive functions 
(Howard & Howard, 1992, 1997) and rel- 
atively little in the way of individual-to- 
individual variation (Reber & Allen, 2000). 
As noted previously in several places, the 
implicit system demonstrates a rather re- 
markable robustness and continues to func- 
tion effectively in the face of a wide vari- 
ety of neurological and psychiatric disorders 
that severely compromise functions based 
on explicit, consciously modulated mecha- 
nisms. It seems clear that the implicit sys- 
tem is the critical mental component that 
enables the infant and child to learn to nav- 
igate the world. Virtually all the essential 
knowledge of the perceptual, sensorimotor,
linguistic, and social patterns that make up
the environment and eventually become the 
epistemic foundations of adulthood is ac- 
quired through this nondeclarative, proce- 
dural mechanism. This is, indeed, how we 
learn about the world around us. For further 
explorations of this and related developmen- 
tal mechanisms, see Halford (this volume). 

Although these aspects of implicit 
thought are fairly well established, there are 
two issues that remain deeply problemati- 
cal and need to be addressed: First, are (or 
better, perhaps, can) these implicitly formed 
representations be regarded as abstract? Sec- 
ond, what role might they play in complex 
cognitive processes such as problem solving 
that have been generally regarded as largely, 
if not completely, explicit and under con- 
scious control? 


One possibility might be that these un- 
conscious, perceptual, and motoric repre- 
sentations are not themselves anything like 
conscious thoughts in terms of their under- 
lying form. Thinking consciously about the 
world involves forming abstract mental “pic- 
tures.” We can be thinking about a tree and 
not necessarily be looking at or remembering
any specific tree. We can know an abstract
rule such as “if A > B and B > C, then A > C”
that can be applied to any set of objects 
that can be ranked along a single dimension 
like height or weight. This kind of abstract 
memorial code feels very natural to us. We 
freely think about legal decisions (guilt or in- 
nocence), geometry (all plane triangles have 
180 degrees), artistic expressions, drama, po- 
etry, aesthetics, politics, and so on. When we 
do, we have an ineffable sense of manipu- 
lating abstract and flexible representations, 
ones that feel loose and unconstrained by 
particular settings or features. 

Our personal, introspective experiences 
with these daily activities are so compelling 
that, historically, consciousness was often 
viewed as though it were the defining fea- 
ture of human thought. The philosophical 
traditions that have had the strongest in- 
fluence on psychology are those of Locke 
and Descartes, and while these two didn’t 
agree on much, the one proposition they 
shared was that cognitive states are transpar- 
ent to introspection. If it’s cognitive, it’s con- 
scious — and by cognitive states they meant 
those that are semantic, flexible in function 
and representationally abstract. In fact, the 
notion that there is anything truly cognitive
about any unconscious process (that
an implicit mechanism could result in abstract
mental representations) is, according to
this perspective, self-contradictory nonsense
(Dennett, 1987).

In the next two sections we explore re- 
lated questions like: Is unconscious, tacit 
knowledge in any way like conscious knowl- 
edge in its complexity? Are implicit rep- 
resentations flexible? Can they be charac- 
terized as abstract? Do they play a role in 
the computations of problem solving? Or
level perceptual processes — rigid, inflexible,
and concrete and playing virtually no role in 
higher-level functions such as problem solv- 
ing and creativity? 


Perceptual and Conceptual 
Representations 


Perceptual representation involves capturing 
surface features of objects without necessar- 
ily understanding what the objects are. A 
picture taken by a computer scanner is an ex- 
ample of a perceptual but not a conceptual 
representation. A scanner can take a picture 
of an object and store it in memory, while 
having no semantics and not representing 
anything meaningfully. The meaningful rep- 
resentation of objects involves, among other 
things, the ability to categorize and form 
mental representations of the categories ab- 
stractly. It has been proposed by a variety 
of researchers that many of the phenom- 
ena discussed in this chapter so far such as 
priming, lexical decision making, word frag- 
ment completion, artificial grammar learn- 
ing, and dot pattern classification are tapping 
perceptual — not conceptual — processes 
(e.g., Perruchet & Vinter, 1998, 2003; Shanks 
& St. John, 1994). From this perspective, the 
unconscious acts as a purely perceptual sys- 
tem capable, in some ways like a scanner or 
a camera, of capturing the perceptual or au- 
ditory properties of the world. The uncon- 
scious, according to this perspective, is not 
particularly smart, and does not contain any 
real representations, at least not those that
are “about” something. 

Exploring these considerations has be- 
come a virtual cottage industry. Toth and 
Reingold (1996) present an overview of the 
work using priming, and Kirsner (1998) 
provides a review of the implicit memory 
literature. Both suggest that, although the is- 
sues are complex, implicitly encoded mate- 
rial shows both abstract and instance-based 
representations. Here we review a topic that 
focuses directly on the issue, transfer in AG 
learning. Unlike the study of implicit mem- 
ory using priming or stem completion in 
which the stimulus materials tend to be 


ments use novel, arbitrary stimulus displays, 
affording the opportunity to examine the 
representational form of knowledge that was 
acquired in a controlled setting. 

The original claim (Reber, 1967, 1969) 
was that the representations established 
while memorizing exemplars from an AG 
like that shown in Figure 18.1 are based on 
the rules of the grammar and, hence, are ab- 
stract and independent of the surface fea- 
tures of the stimuli. This claim did not go 
uncontested. Brooks and Vokey (1991), Du- 
lany, Carlson and Dewey (1984), Perruchet 
and Pacteau (1991), and Shanks and St. John 
(1994) all argued that subjects’ performance 
in these experiments is also consistent with 
representations based on the micro compo- 
nents of the exemplar strings. That is, a well- 
formed sequence like PTVPS is not necessar- 
ily represented as an instance of a complex 
rule but may be captured by a concrete in- 
stantiation. Some (e.g., Perruchet & Pacteau, 
1994; Servan-Schreiber & Anderson, 1990) 
argued for an encoding based on small 
chunks like bi- and trigrams (PT, TV, VPS). 
Others (Brooks & Vokey, 1991) argued for 
a more holistic instantiation of the specific
stimulus input but eschewed the possibility
that the implicit memorial forms are
abstract. 

The key studies that speak to this issue 
are those that use a transfer protocol. That 
is, subjects learn an AG instantiated in one 
symbol set but are switched to stimuli made 
up using a different symbol set at some point 
in the experiment. The argument is that, if 
subjects’ implicit memorial forms are based
on concrete representations, then transfer to
a novel letter set should seriously compro- 
mise their ability to function. If the repre- 
sentations are abstract in nature, the subjects 
should be comfortable with the transfer con- 
dition. 

In the first of these studies Reber (1969) 
asked subjects to memorize letter strings 
from an AG over 12 trial blocks. After the 
sixth block, either the letters used to instan- 
tiate the AG were changed, or the AG it- 
self was changed. Switching letter sets was 
surprisingly benign. So long as the rules
subjects were able to work with novel letter
sets with little difficulty. However, changing 
the rules for letter order disrupted subjects’ 
ability to encode and store the materials. 

This study was followed up by a parallel
series of experiments in which subjects
memorized letter strings from an AG
and then had to judge how well novel 
strings instantiated using new letters had 
been formed. Subjects learn about the un- 
derlying regularities of an AG by memo- 
rizing strings like TSSVVPS but then have 
to judge the grammaticality of strings like 
BXXMMRX. Thus, the surface features of
the stimuli differ from learning to testing, 
but the deep structure remains the same. Us- 
ing this technique, numerous studies have 
found successful transfer (Altmann, Dienes, 
& Goode, 1995; Brooks & Vokey, 1991; 
Gomez & Schvaneveldt, 1994; Knowlton & 
Squire, 1996; Manza & Reber, 1997; Math- 
ews et al., 1989; Shanks, Johnstone, & Staggs, 
1997; Vokey & Brooks, 1992; Whittlesea & 
Dorken, 1993). 
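
The sense in which the two example strings share a “deep structure” can be made concrete with a short sketch. It assumes the transfer string is the seven-letter BXXMMRX and checks only whether two strings have the same pattern of repeated symbols under a one-to-one relabeling of letters. It does not implement the grammar in Figure 18.1, and the function names are illustrative rather than taken from any of the cited studies.

```python
def repetition_pattern(string):
    """Map each symbol to the index of its first appearance in the string."""
    first_seen = {}
    pattern = []
    for symbol in string:
        if symbol not in first_seen:
            first_seen[symbol] = len(first_seen)
        pattern.append(first_seen[symbol])
    return tuple(pattern)


def same_deep_structure(training_string, test_string):
    """True when two strings share one repetition pattern under a letter swap."""
    return repetition_pattern(training_string) == repetition_pattern(test_string)


training = "TSSVVPS"   # string quoted in the text
transfer = "BXXMMRX"   # same structure, new letter set (assumed reading)
print(repetition_pattern(training))
print(same_deep_structure(training, transfer))
```

A check of this kind says nothing about how subjects actually represent the strings; it only shows that the surface letters can change completely while the arrangement of repeats stays fixed.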

Although it is generally agreed that the 
transfer effect is real (Perruchet & Vinter, 
2003; Redington & Chater, 1998), there 
is still no consensus on interpretation. Al- 
though the effect would seem to implicate 
an abstract representational form, Brooks 
and Vokey (1991; Vokey & Brooks, 1992)
have pointed out that transfer could also 
be a product of the physical similarity be- 
tween the grammatical strings and the trans- 
formed test strings. For example, what 
makes the two sequences given above 
“similar” is that they both consist of seven 
letters, they both contain two repeats af- 
ter the first letter, and they end with one 
of those repeating letters. They called this a 
“relational” or “abstract analogy” for the se- 
quences. According to this view, subjects are 
not learning the deep structure of the gram- 
mar that can be applied to any domain; they 
have learned a specific set of facts about in- 
dividual exemplars. 

Brooks and Vokey tested their theory by 
controlling for the physical similarity and the 
grammaticality of the test items and found 
evidence for both forms of encoding. That is, 
about half the explainable variance in sub- 


an underlying abstract representation based 
upon the rules of the AG and about half was 
shown to be dependent on abstract analog- 
ical representations that were linked to the 
physical forms of the input stimuli. 

Of course, these studies still leave open 
the question of the actual memorial form 
of the representations. As noted previously, 
representations could be based not on whole 
items but on more molecular “chunks.” 
There is evidence both in favor of and 
against this chunking interpretation. On one 
hand, Knowlton and Squire (1994) repli- 
cated Brooks and Vokey’s findings, but when 
they controlled for “chunk strength” (stim- 
uli had equal numbers of common bigrams) 
they found little effect of overall similarity. 
On the other, simply encoding the chunking 
characteristics of an AG cannot be all that 
is learned in these experiments. “Chunk- 
trained” subjects who never learn full strings 
perform reasonably well on the grammat- 
icality task (Perruchet & Pacteau, 1990), 
but they do not show transfer (Manza & 
Reber, 1997). 
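
One way to see why chunk knowledge alone would not support transfer is to compute a simple chunk-strength score. The sketch below is an assumption-laden illustration: it treats chunk strength as the mean number of times a test string's bigrams and trigrams occurred in the training strings, and the two-string training set is hypothetical, built from strings quoted earlier in the text. The cited studies may have computed the measure differently.

```python
from collections import Counter


def chunks(string, sizes=(2, 3)):
    """Return every contiguous bigram and trigram in a letter string."""
    return [string[i:i + n] for n in sizes for i in range(len(string) - n + 1)]


def chunk_strength(test_string, training_strings):
    """Mean training-set frequency of the chunks found in a test string."""
    counts = Counter(c for s in training_strings for c in chunks(s))
    test_chunks = chunks(test_string)
    if not test_chunks:
        return 0.0
    return sum(counts[c] for c in test_chunks) / len(test_chunks)


training_set = ["PTVPS", "TSSVVPS"]  # hypothetical training set
print(chunks("PTVPS"))
print(chunk_strength("PTVPS", training_set))    # same letter set
print(chunk_strength("BXXMMRX", training_set))  # new letter set
```

Under this scoring, any string written in a new letter set shares no chunks with the training items and so scores zero, whatever its structure. Above-chance transfer therefore has to rest on something other than stored letter chunks.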

What seems to be emerging from this line 
of research is that there is no “default” rep- 
resentational form (Whittlesea & Dorken, 
1993). Rather, representational form is dic- 
tated by context effects and task demands. 
Manza and Reber (1997) included a con- 
dition that supports this functionalist posi- 
tion. One group memorized letter strings in- 
stantiated in one letter set. Another group 
learned structurally identical strings, but half 
of them were instantiated using a second let- 
ter set. Both groups were tested using strings 
made up of both old and new letter sets. 
The second group showed better transfer on 
the items instantiated with a novel letter set. 
Training with the two distinct instantiations 
encouraged a more abstract representational 
form that assisted the subjects when they 
confronted yet another surface form. 

Finally, several researchers have demon- 
strated another hallmark of abstract rep- 
resentation — cross-modality transfer. Alt- 
mann, Dienes, and Goode (1995), Howard 
and Ballas (1982), and Manza and Reber 
(1997) have all shown that subjects can learn 
visual sequences and make judgments about
and vice versa. Taken together, these findings
suggest that abstraction is an important
factor in AG learning. Whether or not this 
conclusion ultimately applies to all forms of 
tacit knowledge is yet to be determined. Our 
best guess is that virtually all forms of im- 
plicit learning will yield some memorial rep- 
resentations that are abstract but that Whit- 
tlesea and Dorken’s message is likely correct. 
The degree to which the underlying memo- 
rial code is abstract or concrete and what its 
detailed form will look like is going to have 
a good deal to do with the processing con- 
straints placed on individuals in particular 
settings. The manner in which representa- 
tions get established is critical in determin- 
ing which relational generalizations can be 
formed from multiple examples (Holyoak,
Chap. 6).


Creativity and Problem Solving 


We began our exploration of implicit, un- 
conscious process in human thought with 
the “simpler” functions of perception and 
memory. We'll end with a quick look at the 
more complex topics of problem solving and 
creativity (see Novick & Bassok, Chap. 14; 
Sternberg et al., Chap. 15). Although there 
hasn’t been much recent study of uncon- 
scious influence on these functions, the no- 
tion that tacit knowledge affects the creative 
process was a central theme in the Gestalt 
approach (Köhler, 1925; Wertheimer, 1945),
which assumed three main elements of un- 
conscious thought: intuition, or the feeling 
of directionality of the unconscious process; 
incubation, or the tacit processing of in- 
formation that eventually leads to problem 
solving; and insight, the “aha” experience in 
which the implicit processes become conscious
(Kihlstrom, 1999; Dorfman, Shames,
& Kihlstrom, 1996). 

In the now classic test of this model, 
Maier (1931) asked subjects to tie together 
two strings hanging from the ceiling. The 
strings were too far apart to grab one string 
while still holding on to the other. One 
solution was to tie a small object, strate- 


strings and to swing it like a pendulum. 
Maier found that subjects were much more 
likely to solve this problem after the ex- 
perimenter casually brushed against one of 
the strings, producing the swinging motion. 
Interestingly, Maier’s subjects did not re- 
port having consciously noticed the manip- 
ulation. Judson, Cofer, and Gelfand (1956) 
found similar facilitation if subjects mem- 
orized word lists that contained items re- 
lated to the problem’s solution such as swing, 
string or pendulum. Recently, Knoblich and 
Wartenberg (1998) reported a similar ef- 
fect when the priming words were presented 
subliminally. 

Most modern approaches to these issues 
invoke the notion of spreading activation — an 
important theoretical mechanism in many 
contemporary models of human cognition. 
The notion is that experience registers in 
specific cortical areas and “spreads” to other 
“nodes” that are associatively linked with 
the input. Meyer and Schvaneveldt (1971) 
showed that the encoding process influences 
the subsequent processing of related words 
on indirect tests such as the lexical decision 
task. For example, subjects presented with 
words like “bread” respond more rapidly to 
“butter” than to “nurse” — although the re- 
verse applies if the prime is “doctor.” The 
argument is that the initial prime initiates 
a spread of activation and related repre- 
sentations in the semantic network are af- 
fected. The question that interests us here is 
whether such a process can take place un- 
consciously. Are processes like intuition, in- 
sight, and creativity facilitated by the activity 
engendered in semantically related but tac- 
itly represented memories that are not part 
of our conscious experience? 
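
The spreading activation idea can be illustrated with a toy network. Everything in the sketch is hypothetical: the four words come from the priming example above, but the link strengths, the decay value, and the number of steps are arbitrary choices made for illustration, and the code does not model reaction times or any particular published model.

```python
def spread_activation(links, prime, decay=0.5, steps=2):
    """Propagate activation from a prime through weighted associative links."""
    activation = {prime: 1.0}
    frontier = {prime: 1.0}
    for _ in range(steps):
        next_frontier = {}
        for node, level in frontier.items():
            for neighbor, weight in links.get(node, {}).items():
                passed = level * weight * decay
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        for node, level in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + level
        frontier = next_frontier
    return activation


# Hypothetical link strengths, chosen only for illustration.
links = {
    "bread": {"butter": 0.8},
    "butter": {"bread": 0.8},
    "doctor": {"nurse": 0.8},
    "nurse": {"doctor": 0.8},
}

after_bread = spread_activation(links, "bread")
print(after_bread.get("butter", 0.0) > after_bread.get("nurse", 0.0))
```

In this sketch the prime raises the activation of its associate and leaves unrelated words untouched, which is the qualitative pattern the lexical decision results are taken to reflect.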

Yaniv and Meyer (1987) examined this 
possibility by looking at the influence of in- 
accessible material on reaction time in a lex- 
ical decision task. Subjects were first read 
definitions of rare words and asked to pro- 
vide the word and, when they could not, to 
rate their feeling of knowing the word. In 
the subsequent lexical decision task, subjects 
showed faster reaction times for words that 
they could not recall than for control words 
and, interestingly, the “feeling of knowing”
A kind of metacognition appears to be operating
in this situation in which subjects
are sensitive to the contents of tacit knowl- 
edge even though the actual material is not 
available for conscious recall. In an extension 
of this idea, Shames, Kihlstrom, and Forster 
(1994) presented subjects with a list of three 
words such as “goat,” “pass,” and “green” and
asked them to generate an associate that all 
three have in common. They found that in 
cases in which subjects could not provide 
the correct answer to the triad (“mountain”), 
the reaction time to the correct answer was 
faster on a lexical decision task than an unre- 
lated word. These experiments, along with 
the Gestalt problem-solving studies, suggest 
that the initial experience of trying to solve 
a problem, retrieve a rare word, or find the 
common element in a word-triad seems to 
set in motion a spread of activation pro- 
cess which, intriguingly, is engaged effec- 
tively even with knowledge that is tacit and 
unavailable for conscious recall. 

Additional recent evidence of uncon- 
scious influences in problem solving is seen 
in complex problems that do not require 
top-down control such as the balls and boxes 
puzzle (P. J. Reber & Kotovsky, 1997). In 
these studies subjects sit in front of a com- 
puter screen displaying five boxes, each of 
which is associated with one of five balls. Ini- 
tially all the balls sit outside the boxes and 
the goal is to place all the balls inside the 
boxes. The rule for moving balls in or out 
of boxes is as follows: “The rightmost ball 
can always move; other balls can be moved 
if the ball immediately to the right is in its 
box and all other balls to the right are out 
of their boxes.” The results showed that par- 
ticipants frequently solved the puzzle while 
being unaware of the rule system that gov- 
erned it. The following is a telling conversa- 
tion between the experimenter and one of 
the participants immediately following the 
first completion of the puzzle. 


Experimenter: Now I want to ask you 
about the puzzle you just solved, how 
it worked, what you did. 

Participant: No idea 

E: No idea? 


E: You did get it, right? 

P: Yeah, but it was basically luck... that I 
got it. 

E: You had no idea what you were doing? 

P: Not really. 


E: Suppose somebody else was going to 
do the puzzle who had never seen it 
before and you had to give them some 
hints, tell them how to solve it. 


P: Well, let’s see. I don’t know what 
to say... but, I guess (garbled) the 
puzzle ...the good part was that there 
usually wasn’t any more than like one 
or two choices. I think there was 
one choice was there any more than 
one choice? I don’t know. But I had 
(garbled). Which is why I kept ending 
up back where I started from, which 
was frustrating. I would tell them, I 
would tell them, good luck. That’s all. 


In spite of being unaware of the rules, this 
participant’s overall performance was quite 
good. In fact, as Reber and Kotovsky re- 
ported, “Immediately after giving this fairly 
uninformative description of his process of 
solving the puzzle he...solved the puzzle 
in 21 moves (the minimum). ...superior to 
most of the other participants.” 
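
The movement rule quoted above is compact enough to encode directly, which gives a sense of the structure participants were navigating without being able to describe it. The sketch below is an illustration under stated assumptions, not the software used in the original study: boxes are indexed left to right, 0 means a ball is out and 1 means it is in, and the function names are invented here. A breadth-first search over the legal moves finds the shortest solution from all balls out to all balls in.

```python
from collections import deque


def legal_moves(state):
    """Yield states reachable by moving one ball in or out of its box."""
    n = len(state)
    for i in range(n):
        rightmost = i == n - 1
        # Other balls move only if the right-hand neighbor is in its box
        # and every ball farther to the right is out.
        neighbor_in_rest_out = (
            not rightmost and state[i + 1] == 1 and not any(state[i + 2:])
        )
        if rightmost or neighbor_in_rest_out:
            yield state[:i] + (1 - state[i],) + state[i + 1:]


def minimum_moves(n_balls=5):
    """Breadth-first search from all balls out (0) to all balls in (1)."""
    start = (0,) * n_balls
    goal = (1,) * n_balls
    distance = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return distance[state]
        for nxt in legal_moves(state):
            if nxt not in distance:
                distance[nxt] = distance[state] + 1
                queue.append(nxt)
    return None


print(minimum_moves(5))
```

For five balls the search returns 21, in agreement with the minimum reported in the passage quoted above. At every state there are at most two legal moves, which fits the participant's remark that there “usually wasn’t any more than like one or two choices.”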

Recent experiments provide similar find- 
ings with regard to creativity. Marsh, Bink, 
and Hicks (1999) demonstrated the pos- 
sible influence of previously encountered 
events, which are not necessarily consciously 
remembered, on creative expression and 
thought. Participants were asked to spend 
a period of twenty minutes drawing space 
creatures from their imagination. Before be- 
ginning they were shown three pictures 
of fictitious space creatures presented as 
examples of other participants’ drawings. 
Each of these creatures had fangs, spikes, 
or weapons, all objects which are oriented 
around one theme — hostility. Participants 
were then asked to draw any type of crea- 
ture that they wanted as long as they did 
not copy any aspect of the creatures shown. 
The results were intriguing and reminiscent 
of the work of Jacoby and his colleagues with 
the process-dissociation procedure discussed
told not to include any of the exemplar
characteristics, the core concept around 
which these characteristics revolved, hos- 
tility, could be seen in most of their 
creative work. 

Importantly, little influence of the actual 
characteristics of the exemplars could be 
seen in the creative works of the participants. 
The elements that made the exemplar crea- 
tures hostile (fangs, spikes, and weapons) 
were virtually never depicted by the partic- 
ipants in their novel drawings. Rather it was 
the underlying theme, the shared qualities of 
the exemplars that were influencing the par- 
ticipants’ drawings. In a post-experimental 
interview, only 4 of the 142 participants de- 
scribed the original three samples as dis- 
playing hostility. These effects were not lim- 
ited to visual displays. Subjects who initially 
worked with scrambled sentences exhibiting 
a mild hostility-related theme produced sim- 
ilar data. These results are consistent with 
the spreading activation perspective in that 
the creative process is facilitated by previ- 
ously encountered, and unconsciously de- 
tected, themes in one’s environment. 

Taken together, these studies suggest that 
complex processes such as problem solv- 
ing or creative invention can be influenced 
by previously encountered experiences that 
are not, at the critical time of the task, 
consciously available. There are relatively 
few studies that have looked directly at 
this issue and, of course, we are not sug- 
gesting that these experiments are process 
pure. It is possible that subjects in these 
situations, to some extent, have been con- 
sciously aware of the previously provided 
material (see Jacoby, Lindsay, & Toth, 1992). 
Nevertheless, the work is provocative and 
is coordinate with the converging lines of 
evidence cited previously. See Sternberg 
(Chap. 31) for additional approaches to the 
issue of creativity. 


Conclusions and Future Directions 


Unhappily, we don’t feel as though we have 
presented more than a dollop of the liter- 
ature. We never got to discussing work on 


nitive factors in sensorimotor skills (Weiss, 
Reber, & Owen, in review), various formal 
models of implicit learning (Cleeremans, 
1993; Keele, Ivry, et al., 2003), the role that 
implicit processes play in aesthetics (Zizak 
& Reber, 2004), social intuition (Lieber- 
man, 2000), moral judgment (Haidt, 2001), 
creativity (Polanyi, 1958; Reber, 1993), the 
time course of memory consolidation and 
sleep (Litman & Reber, 2002), the patterns 
of lost and preserved functions in a vari- 
ety of developmental disorders (Don et al., 
2003; Smith, 2003), the issue of life-span 
changes and individual differences (Reber & 
Allen, 2000), implicit acquisition of fear and 
other emotions (Phelps, 2004) and the influ- 
ence of unconscious thought on psycholog- 
ical well-being (Pyszczynski, Greenberg, & 
Solomon, 1999). 

Over the past several decades, it has be- 
come increasingly clear that implicit pro- 
cesses, those that operate largely outside of 
the spotlight of consciousness, play a signif- 
icant role in most of the interesting things 
that human beings do. We can only hope that 
the coming decades will produce a better un- 
derstanding of these mechanisms, their un- 
derlying cortical pathways, and the manner 
in which they are integrated into the com- 
plex synergistic interplay of the top-down 
and the bottom-up that makes up human 
cognitive functioning.


CHAPTER 19


Thinking in Working Memory 


Robert G. Morrison 


Introduction 


It is not an accident that this discussion of 
working memory is positioned near the cen- 
ter of a volume on thinking and reasoning. 
Central to higher-level cognitive processes 
is the ability to form and manipulate men- 
tal representations (see Doumas & Hummel, 
Chap. 4). Working memory is the cogni- 
tive construct responsible for the mainte- 
nance and manipulation of information and 
therefore is necessary for many of the
types of complex thought described in this 
book. Likewise, the development and fail- 
ures of working memory are critical to un- 
derstanding thought changes with develop- 
ment (see Halford, Chap. 22) and aging (see 
Salthouse, Chap. 24) as well as many types 
of higher-level cognitive impairments (see 
Bachman & Cannon, Chap. 21). In spite of 
its obvious importance for thinking and rea- 
soning, working memory’s role in complex 
thought is just beginning to be understood. 
In this chapter, we review several dominant 
models of working memory, viewing them 
from different methodological perspectives, 


including dual-task experiments, individual 
differences, and cognitive neuroscience. 


Multiple Memory Systems? 


Although the idea of separate primary mem- 
ory is credited to William James (1890), 
Waugh and Norman (1965) and Atkinson 
and Shiffrin (1968) developed the idea 
of distinct primary (i.e., short-term) and
secondary (i.e., long-term) memory com- 
ponents into defined models of the hu- 
man memory system. These multicompo- 
nent models of memory were supported 
by observations from many different stud- 
ies during the 1950s and 1960s. Perhaps the 
most familiar justification for separate short- 
term and long-term memory systems is the 
serial position effect (e.g., Murdock, 1962). 
During list learning, the most recently stud- 
ied items show an advantage when tested 
immediately — an advantage that goes away 
quickly with a delay in test provided that 
participants are prevented from rehearsing. 
This recency effect is presumably the result


Figure 19.1. Atkinson and Shiffrin’s (1968) multicomponent memory model.


of quickly unloading short-term memory at 
test. In contrast, the first items in the list 
show an advantage that withstands a delay 
period. This primacy effect presumably oc- 
curs because these initial items have been 
stored in long-term memory through prac- 
tice. Conrad (1964) provided another im- 
portant finding justifying distinct systems 
when he observed that errors in short- 
term remembering were usually phonolog- 
ical whereas long-term memory was dom- 
inated by semantic coding. This suggested 
that rehearsal or storage systems were dif- 
ferent between the two types of memory. 
Yet another important finding was that, al- 
though the capacity of long-term memory 
was seemingly limitless, short-term mem- 
ory as observed in a simple digit-span task 
was of limited capacity (Miller, 1956),
a finding confirmed using many other ex- 
perimental paradigms. Lastly, around this 
same era, neuropsychological evidence be- 
gan to emerge suggesting that at least parts 
of the short- and long-term memory systems 
were anatomically distinct. Milner’s (1966) 
famous amnesic patient, HM, with his long- 
term memory deficits but preserved short- 
term digit span, and Shallice and Warring- 
ton’s (1970) patient, KF, with his intact 
long-term memory but grossly impaired 
digit span, presented a double dissociation 
favoring at least partially distinct short- and 
long-term memory systems. Atkinson and 
Shiffrin’s (1968) memory model was typical 
of models from the late 1960s with distinct 
sensory, short-term, and long-term memory 
stores (Figure 19.1). Short-term memory was 
viewed as a short-term buffer for informa- 
tion that was maintained by active rehearsal. 
It was also believed to be the mechanism by 


which information was stored in long-term 
memory. 


A Multi-component Working 
Memory Model 


While exploring the issues described in the 
previous section, Baddeley and Hitch (1974) 
proposed a model that expanded short-term 
memory into the modern concept of working 
memory — a term that has been used in several
different contexts in psychology. Baddeley
(1986) defined working memory as “a system 
for the temporary holding and manipula- 
tion of information during the performance 
of a range of cognitive tasks such as com- 
prehension, learning, and reasoning” (Ref. 3, 
p. 34). In a recent description of his
working-memory model, Baddeley (2000) proposed
a four-component model (Figure 19.2), in- 
cluding the phonological loop, the visuospa- 
tial sketchpad, the central executive, and the 
model’s most recent addition, the episodic 


[Figure 19.2 diagram: the central executive above the visuospatial sketchpad, episodic buffer, and phonological loop; beneath these, semantics, episodic long-term memory, and language.]


Figure 19.2. Baddeley’s (2000) four-component
working memory model.


ceptualized based on results from behavioral
dual-task paradigms and neuropsychology.
For instance, using behavioral methods,
Baddeley and Hitch (1974) reasoned that 
they could identify the separable elements 
of working memory by looking for task in- 
terference. If you assume the various compo- 
nents of working memory are capacity lim- 
ited, then if the simultaneous performance 
of a secondary task degrades performance 
of a primary task, these two tasks must 
tap a common limited resource — partic- 
ularly if there exists another primary task 
that is unaffected by performance of the sec- 
ondary task and is affected by a different 
secondary task that does not affect the first 
primary task. Likewise, neuropsychological 
evidence such as the existence of patients 
with selectively disabled verbal (e.g., patient 
KF, Shallice & Warrington, 1970) and visual 
(e.g., de Renzi & Nichelli, 1975) digit span 
suggested that verbal and visual working- 
memory systems are somewhat separable 
as well. 

Using this type of methodology, Baddeley 
has suggested that the phonological loop and 
visuospatial sketchpad are modality-specific 
slave systems that are responsible for main- 
taining information over short periods of 
time. The phonological loop is responsible 
for the maintenance and rehearsal of information
that can be coded verbally (e.g., the
digits in a digit-span task). It is phonemi- 
cally sensitive (e.g., Ted and Fred are harder 
to remember than Ted and Bob), and its ca- 
pacity is approximately equal to the amount 
of information that can be subvocally cy- 
cled in approximately 2 seconds. Baddeley 
(1986) argues that these two characteris- 
tics of verbal working memory are best ex- 
plained by two components: (1) a phonologi- 
cal store that holds all of the information that 
is currently active and is sensitive to 
phonemic interference effects and (2) an 
articulatory loop that is used to refresh the 
information via a process of time-limited 
subvocal cycling. The articulatory loop 
is specifically disrupted by the common 
phonological loop secondary task, articula- 
tory suppression (i.e., repeating a word or 




Figure 19.3. The Brooks (1968) letter task. 
Participants are to image a block letter and then 
decide whether each corner of the letter is an 
outside edge. 


number vocally). Thus, verbal span is con- 
strained by both the amount of information 
to be maintained and the time that it takes 
to rehearse it. In contrast to the phonologi- 
cal loop, the visuospatial sketchpad has been 
more difficult to describe. In a dual-task ex- 
periment, Baddeley (1986) asked subjects to 
simultaneously perform a pursuit rotor task 
(i.e., track a spot of light that followed a
circular path with a stylus) while perform- 
ing either a verbal or spatial memory task 
previously developed by Brooks (1968; Fig- 
ure 19.3). The verbal task required subjects 
to remember a sentence (e.g., “A bird in hand 
is not in the bush”) and scan through each 
word deciding whether it was a noun or not. 
The correct pattern of output for this example
would be: NO, YES, NO, YES, NO, NO, NO, NO,
YES. In the visual memory task, participants
are first shown a block letter with one corner 
marked with an asterisk (Figure 19.3). They 
are then asked to imagine the letter and, be- 
ginning at the marked corner, judge whether 
each corner is an outside corner or not. Thus, 
in both the verbal and visual memory tasks, 
participants are required to hold a modality- 
specific object in memory and inspect it, an- 
swering yes or no to questions about their 
inspection. Baddeley found that the visual 
memory task, but not the verbal memory 
task, seriously degraded pursuit rotor track- 
ing performance. 
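
The verbal version of the Brooks task can be illustrated with a short sketch that scans the example sentence word by word. The noun list is tagged by hand for this one sentence and the function name is hypothetical; no part-of-speech tagger is assumed.

```python
from typing import List

# Nouns in the example sentence, tagged by hand (an assumption made for this sketch).
NOUNS = {"bird", "hand", "bush"}


def brooks_verbal_responses(sentence: str) -> List[str]:
    """Answer YES for each noun and NO for every other word, in sentence order."""
    words = sentence.lower().replace(".", "").split()
    return ["YES" if word in NOUNS else "NO" for word in words]


responses = brooks_verbal_responses("A bird in hand is not in the bush")
print(", ".join(responses))
```

The participant must hold the whole sentence in memory while producing this sequence, which is what makes the task a verbal counterpart to inspecting the imagined block letter.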

Logie (1995) has argued for a visual sim- 
ilarity effect analogous to the phonemic 
similarity effect used to support the phonological
store. Participants were visually
letters (e.g., “KcPs” or “gBrQ”). Letters were
chosen based on the similarity of their lower 
and uppercase characters. Thus Kk, Cc, Pp, 
Ss were visually similar while Gg, Bb, Rr, Qq 
were visually dissimilar. To discourage use of 
the phonological loop to perform the task, 
participants performed simultaneous articu- 
latory suppression. After a retention period, 
participants had to write down the letter 
sequence in correct order and case. Logie 
found that participants made significantly 
more errors when the letter cases were vi- 
sually similar. This finding suggests the ex- 
istence of a visual store analogous to the 
phonological store in the phonological loop. 
It is possible that a visual rehearsal loop anal- 
ogous to the articulatory loop exists; how- 
ever, to date evidence is limited to introspec- 
tive accounts of mnemonics. What is clear is 
that both visual and spatial qualities of stim- 
uli can be stored in the short term; how- 
ever, the independence of systems responsi- 
ble for visual and spatial memory is the topic 
of much debate (see Logie, 1995). 

The third component of Baddeley’s work- 
ing memory model, the central executive, 
was initially a catch-all for the working-memory
processes necessary for certain cognitive
abilities that did not fit cleanly
into the phonological loop or visuospatial 
sketchpad. This category included many 
of the cognitive abilities discussed in this 
book, including reasoning, problem solv- 
ing, and language. For instance, Shallice and 
Warrington’s (1970) patient KF had a dras- 
tically degraded verbal span (i.e., two let- 
ters) with relatively intact language compre- 
hension. Believing that both of these abili- 
ties required working memory, Baddeley and 
Hitch (1974) reasoned that verbal span and 
language comprehension must use separate 
working-memory modules. To test this hy- 
pothesis, they devised a short-term mem- 
ory load task that balanced maintenance load 
and time (Baddeley & Hitch, 1974). For in- 
stance, a low-load condition might require 
participants to remember three numbers, 
outputting them every 2 seconds, while a 
high-load condition might require partici- 
pants to remember six numbers, outputting 


formed this secondary task while simulta- 
neously performing a primary task involv- 
ing auditory language comprehension. They 
found that language comprehension only 
suffered at high concurrent memory load 
and not under lower memory load condi- 
tions. At low memory load, participants had 
sufficient resources to carry out the compre- 
hension task; however, at high memory load 
there were insufficient resources for lan- 
guage comprehension. Adding the results of 
this study to many other similar experiments 
and the neuropsychological evidence from 
patients like KF, Baddeley and Hitch postulated
that comprehension and digit span utilize
separate modules of working memory
that tap a common resource pool.

Given the amorphous nature of the cog- 
nitive tasks for which the central executive 
was necessary, Baddeley (1986) initially em- 
braced Norman and Shallice’s (1980; 1986) 
concept of a Supervisory Attentional Sys- 
tem as a model for the central executive. 
Norman and Shallice suggested that most 
well-learned cognitive functions operate via 
schemata, or sets of actions that run auto- 
matically. Although many schemata may be 
shared by most individuals (e.g., driving a 
car, dialing a telephone, composing a simple 
sentence, etc.), additional schemata may be 
acquired through the development of spe- 
cific expertise (e.g., writing lines of com- 
puter code, swinging a golf club, etc.). At 
many times during an ordinary day, we must 
perform more than one of these schemata 
concurrently (e.g., talking while driving). 
Norman and Shallice suggest that when we 
must perform multiple schemata, their co- 
ordination or prioritization is accomplished 
via the semi-automatic Contention Scheduler 
and the strategically controlled Supervisory 
Attentional System. The Contention Sched- 
uler uses priorities and environmental cues 
(e.g., a car quickly pulls in front of me), 
whereas the Supervisory Attentional System 
tends to follow larger goals (e.g., convinc- 
ing my wife that I’m a good driver). Thus, 
when the car rapidly pulled in front of me, 
I pressed the brake on the car and then proceeded
to tell my wife how attentive I am
while driving. An important characteristic
of the Supervisory Attentional System
as a model of the central executive was
that it was sensitive to capacity limits. Ac- 
cording to Norman and Shallice, capacity 
limits constrain thinking and action during 
(1) complex cognitive processes such as rea- 
soning or decision making; (2) novel tasks 
that have not developed schemata; (3) life- 
threatening or single, difficult tasks; and (4) 
functions that require the suppression of 
habitual responses. 

Baddeley (1986) suggested that the Supervisory
Attentional System provided a
useful framework for understanding random 
generation, a task frequently associated with 
the central executive. In random generation,
a participant is asked to generate a
series of random responses from a predetermined
list (e.g., integers from 0 to 9,
for instance: 1, 8, 4, 6, 0, 7, 6, 8, 4, 5, 6, 1, 2). Response
patterns from this task usually exhibit
two characteristics: (1) certain responses appear
at much lower frequencies than others
(e.g., 3 or 9 did not appear whereas
1,4,6, and 8 appeared repeatedly) and (2) 
stereotyped responses (e.g., 4,5,6 or 1,2) 
are much more common than other equally 
likely two- or three-number sequences (Bad- 
deley, 1966). Baddeley suggested that the 
higher-order goal of randomness is at odds 
with the dominant schemata for the pro- 
duction of numbers (i.e., counting). Thus, 
random generation potently requires the ser- 
vices of the Supervisory Attentional Sys- 
tem to override or inhibit the dominant 
schemata. When random number genera- 
tion is performed with another working- 
memory-intensive task, the resources avail- 
able to the Supervisory Attentional System 
(i.e., central executive) are in even more demand
and responses become more stereotyped
(Baddeley et al., 1998).
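
The two signatures of nonrandom responding can be checked directly on the example sequence given above. This is a minimal sketch with hypothetical function names; the chance estimate assumes independent, uniformly distributed digits, which is an idealization introduced here and not a scoring method taken from the studies cited.

```python
from collections import Counter
from typing import Dict, List, Sequence


def digit_counts(seq: Sequence[int], alphabet: Sequence[int] = range(10)) -> Dict[int, int]:
    """How often each permitted response was produced (zero if never)."""
    counts = Counter(seq)
    return {digit: counts.get(digit, 0) for digit in alphabet}


def counting_steps(seq: Sequence[int]) -> int:
    """Number of adjacent pairs in which the second digit is the first plus one."""
    return sum(1 for a, b in zip(seq, seq[1:]) if b == a + 1)


sequence: List[int] = [1, 8, 4, 6, 0, 7, 6, 8, 4, 5, 6, 1, 2]

counts = digit_counts(sequence)
missing = [digit for digit, n in counts.items() if n == 0]
steps = counting_steps(sequence)

# Under the idealized assumption of independent uniform digits, 9 of the 100
# possible ordered pairs are counting steps (0-1, 1-2, ..., 8-9).
expected_steps = (len(sequence) - 1) * 9 / 100

print("never produced:", missing)
print("counting steps observed:", steps, "expected by chance:", round(expected_steps, 2))
```

A sequence that omits some digits entirely and contains more counting steps than the chance estimate shows both of the characteristics described in the text.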

Although the Supervisory Attentional 
System describes an important ability that 
underlies complex cognitive processes such 
as language comprehension and problem 
solving, it fails to offer a tenable account 
of how, short of a homunculus, this direction
would occur. Acknowledging this problem,
Baddeley’s current model of the central
executive fractionates the central executive
in the hope that by understanding precisely
what the central executive does we might 
learn how it does it. Baddeley (1996) sug- 
gested four arguably distinct central exec- 
utive functions: “(1) the capacity to coor- 
dinate performance on two separate tasks, 
(2) the capacity to switch retrieval strategies 
as reflected in random generation, (3) the 
capacity to attend selectively to one stimu- 
lus and inhibit the disrupting effect of oth- 
ers, and (4) the capacity to hold and ma- 
nipulate information in long-term memory, 
as reflected in measures of working memory 
span” (Ref. 4, p. 5). Thus, Baddeley argued 
that the central executive is important for 
task switching, inhibition of internal repre- 
sentations or prepotent responses, and the 
activation of information in long-term mem- 
ory during an activity that requires the active 
manipulation of material. In comparison to 
the slave systems, relatively little attention 
has been paid to the central executive utiliz- 
ing dual-task methodologies. 

The last and most recently added compo- 
nent of Baddeley’s working-memory model 
is the episodic buffer. One problem encoun- 
tered by a modal working-memory model is 
the need for integration. How can a com- 
plex problem requiring the integration of 
information across modalities be solved if 
all the information is being held in sepa- 
rate distinct buffers? This binding problem, 
whether it is binding information within 
a modality or across modalities, is one of 
the central challenges for a working-memory 
system capable of high-level cognition (see 
Doumas & Hummel, Chap. 4). To address 
this issue, Baddeley (2000) has proposed 
a third type of buffer that uses a multidi- 
mensional code. Thus, this buffer can main- 
tain information from several modalities that 
has been bound together by the central ex- 
ecutive. Fuster, Bodner, and Kroger (2000) 
have found evidence of the existence of neu- 
rons in prefrontal cortex that seem to be re- 
sponsible for this type of function. Another 
important function of the episodic buffer 
is serving as a scratchpad for the develop- 
ment of new mental representations during 
complex problem solving. There are many
tions ascribed to the episodic buffer, but the
methods for studying such a resource utiliz- 
ing the task-interference paradigm are still 
under development. 


Embedded-Processes 
Working-Memory Model 


Although Baddeley’s multi-component 
working-memory model has dominated 
the field for much of the past thirty years, 
there are alternative conceptions of work- 
ing memory. Cowan (1988, 1995) has 
proposed a model that tightly integrates 
short- and long-term memory systems 
with attention. In his Embedded-Processes 
working-memory model (Figure 19.4), 
Cowan defines working memory as the set 
of cognitive processes that keep mental 
representations in an easily accessible state. 
Within this system, information can either 
be within the focus of attention, which 
Cowan believes is capacity limited, or 
in active memory, which Cowan suggests 
is time limited. The focus of attention is 
similar to James’s (1890) concept of primary 
memory and is equated to the information 
that is currently in conscious awareness. In 
contrast, active memory, a concept similar 
to Hebb’s (1949) cell assemblies or Ericsson 
and Kintsch’s (1995) long-term working 
memory, refers to information that has 
higher activation either from recently being 
in the focus of attention or through some 
type of automatic activation (e.g., priming). 
In the Embedded-Processes model, a central 
executive, somewhat similar to Norman 
and Shallice’s (1980, 1986) Supervisory
Attentional System, is responsible for bringing
information into the focus of attention
while an automatic recruitment of attention 
mechanism can bring information into 
active memory without previously having 
been in the focus of attention. 

A critical distinction between Cowan’s 
Embedded-Processes model and Baddeley’s 
multi-component model is how the two 
models deal with the topic of maintenance of 


Baddeley hypothesizes modality-specific buffers
for the short-term storage of information 
that coordinate with the Episodic Buffer, 
which is responsible for storing integrated in- 
formation. In contrast, Cowan suggests that 
information is maintained in working mem- 
ory simply by activating its representations 
in long-term memory via short-term-specific
neurons in the prefrontal or parietal cortices.
This latter view suggests that information
from different modalities will behave
differently to the extent that they are coded 
differently in long-term memory, a view 
somewhat at odds with findings of phonolog- 
ical errors in short-term memory tasks and 
semantic errors in long-term memory tasks. 
Cowan counters this objection by noting 
that different codes are used in the storage of 
information in long-term memory and, de- 
pending on the nature of the task, different 
codes are likely to be more important. Like- 
wise, Baddeley has argued that short-term 
and long-term memory systems are distinct 
based on neuropsychological evidence sug- 
gesting that short-term and long-term sys- 
tems can be dissociated and therefore must 
be distinct systems. This argument, how- 
ever, relies to some extent on the belief that 
the individual short- and long-term systems 
are anatomically unitary, an assumption that 
seems unlikely given recent evidence from 
cognitive neuroscience. Fuster has argued, 
based on results from single-cell recording 
in nonhuman primates, that neurons in pre- 
frontal cortex are responsible for maintain- 
ing information in working memory (Fuster 
& Alexander, 1971); however, disrupting circuits
between this area and more posterior
or inferior regions associated with long-term
storage of information can also result
in working-memory deficits (Fuster, 1997). 
Recent evidence from electrophysiology in 
humans seems to confirm that areas in pre- 
frontal cortex and areas associated with long- 
term storage of information are temporally 
coactive during working-memory tasks (see 
Ruchkin et al., 2003, for a review). 

Figure 19.4. Simplified diagram of Cowan’s (1988) Embedded-Processes Model.

A second important distinction between
Baddeley’s multi-component
working-memory model and Cowan’s
Embedded-Processes model is modality
specificity. Specifically, Baddeley has pro- 
posed independent modules within working 
memory for maintaining information from 
different modalities (e.g., visual or verbal). 
In contrast, Cowan suggests only a domain- 
general central executive that, in turn, can 
activate networks for various modalities of 
information stored in long-term memory. 
Baddeley also proposes a domain-general 
central executive, so the main distinction 
between the models is whether information 
to be maintained in working memory is 
loaded into domain-specific buffers or 
whether it is simply activated in long-term 
memory. From our earlier discussion, there 
seems to be no doubt that it is easier to 
maintain a certain quantity of information 
across several modalities than to maintain 
the same amount of information within just 
a single modality. Although this observation 
does not necessitate independent buffers, it 
does suggest that capacity limitations may 
be somewhat domain-specific. 


Reasoning and Working Memory: 
Using the Task-Interference Paradigm 


Although the task-interference paradigm 
has been very useful in exploring
working-memory slave systems, relatively little has
been done using this technique to study 
high-level cognition or the central executive. 
Central to high-level cognitive processes is 
the ability to form and manipulate men- 
tal representations. Review of the functions 
of the central executive in either Baddeley’s
or Cowan’s model suggests that the central
executive should be critical for thinking
and reasoning — a hypothesis that has been 
confirmed in several studies. In their semi- 
nal work on working memory Baddeley and 
Hitch (1974) asked participants to perform 
a reasoning task in which they read a sim- 
ple sentence containing information about 
the order of two abstract terms (i.e., A and 
B). Their task was to judge whether a let- 
ter sequence presented after the sentence 
reflected the order of the terms in the state- 
ment. For instance, a TRUE statement would 
be “A not preceded by B” followed by AB 
(Ref. 7, p. 50). Baddeley and Hitch varied the
statements with respect to statement voicing
(i.e., active or passive), negation, and verb
type (i.e., precedes or follows). They found 
that low concurrent memory loads (i.e., one 
to two items to remember) had no effect on 
reasoning accuracy or response time; how- 
ever, high concurrent memory load (i.e., six 
items to remember) had a reliable effect on 
response time. Depending on the emphasis
of the instructions used, they found that
in the reasoning task or the memory task.
There was no statistical interaction between 
concurrent memory and the reasoning task 
difficulty. 
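
The sentence-verification task can be made concrete with a small sketch that evaluates one statement against a two-letter sequence. The function name and parameter names are hypothetical, and the sketch models only the three factors named above (voice, negation, and verb type); it makes no claim about how participants actually solve the problem.

```python
def statement_is_true(subject: str, other: str, verb: str,
                      passive: bool, negated: bool, order: str) -> bool:
    """Evaluate statements such as 'A is not preceded by B' against an order such as 'AB'.

    verb is 'precede' or 'follow'; order contains each of the two letters exactly once.
    """
    subject_pos = order.index(subject)
    other_pos = order.index(other)
    # Active voice: "X precedes Y" means X comes first; "X follows Y" means X comes second.
    if verb == "precede":
        claim = subject_pos < other_pos
    else:
        claim = subject_pos > other_pos
    # Passive voice reverses the roles: "X is preceded by Y" means Y comes first.
    if passive:
        claim = not claim
    # Negation flips the truth value of the whole statement.
    if negated:
        claim = not claim
    return claim


# The example from the text: "A not preceded by B" followed by AB.
example = statement_is_true("A", "B", "precede", passive=True, negated=True, order="AB")
print(example)
```

Each added transformation (passive voice, negation) is one more operation to carry out on the representation being held, which is one way to think about why such statements differ in difficulty.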

Several other researchers have investi- 
gated how working memory is important for 
deductive reasoning. Gilhooly et al. (1993), 
utilizing methods similar to Baddeley and 
Hitch, asked participants to perform verbal
syllogisms (see Evans, Chap. 8, for a description
of syllogistic reasoning) of varying
levels of complexity. In a first experiment, 
participants either viewed the premises of 
the syllogisms visually, all at once, or heard 
the premises read one at a time. Gilhooly 
et al. hypothesized that verbal presentation 
would result in a higher working-memory 
load because participants would have to 
maintain the content of the premises before 
they were able to solve the problem. They 
found this result: Participants made more er- 
rors in the verbal condition than in the vi- 
sual condition. An error analysis indicated 
that the errors made were the result of not 
remembering the premises correctly, not er- 
rors made in the process of integration of in- 
formation between premises. In a second ex- 
periment, they had participants perform the 
syllogism task visually while performing one 
of three different secondary tasks. They 
found that only random number generation 
interfered with performance of syllogisms. 
Gilhooly et al. concluded that the central 
executive is critical for relational reason- 
ing and the phonological loop (as interfered 
with by articulatory suppression) may be 
involved to a lesser extent. They also con- 
cluded that the visuospatial sketchpad, as 
interfered with by spatial tapping (i.e., tapping
a fixed pattern with the fingers), was not
important for performing verbal syllogisms 
and thus argued against models of reason- 
ing that are at least in principle dependent 
on involvement of visual working memory 
(e.g., Kirby & Kosslyn, 1992; Johnson-Laird,
1983). In a similar study, Toms, Morris, and 
Ward (1993) found no evidence that a vari- 
ety of secondary tasks loading on either the 
phonological loop or visuospatial sketchpad 


or latency. Of the secondary tasks they used, 
only a high concurrent memory load (i.e., six 
digits) affected reasoning performance, and 
this effect appeared to be limited to difficult 
syllogisms. 

Klauer, Stegmaier, and Meiser (1997) had 
participants perform syllogisms and spatial 
reasoning tasks that involved transitive in- 
ference (see Halford, Chap. 22, for a de- 
scription of transitive inference tasks). The 
spatial reasoning problems varied in com- 
plexity from simple transitive inference (e.g., 
“The circle is to the right of the trian- 
gle. The square is to the left of the tri- 
angle.” See Ref. 44, p. 13) to more com- 
plicated transitive inference problems that 
required greater degrees of relational integration.
Klauer et al. had participants perform
a visual tracking task (i.e., follow
one object on a screen filled with distrac- 
tor objects) while listening to the premises 
of the reasoning problems. They found 
that this visuospatial secondary task inter- 
fered with spatial reasoning but had little 
effect on syllogism performance. In another 
experiment, Klauer et al. presented syllo- 
gisms or spatial reasoning problems either 
auditorily (as in the previous experiment)
or visually on a computer screen. While 
performing these primary tasks, participants 
performed random generation either ver- 
bally or spatially, by pressing keys in a ran- 
dom pattern. They found that both forms of 
random generation affected both syllogism 
and spatial reasoning performance; however, 
spatial random generation caused somewhat 
less interference than verbal random gener- 
ation — a finding consistent with Baddeley et 
al.’s (1998) extensive study of random gen- 
eration. In their final experiment, Klauer et 
al. found that articulatory suppression (i.e., 
counting repeatedly from 1 to 5) had a mild 
effect on syllogism and spatial reasoning la- 
tencies. Overall, Klauer et al. found evidence 
for involvement of the central executive (as 
interfered with by random generation) and 
somewhat less interference by slave system 
tasks consistent with the modality of the reasoning
task.

spatial reasoning we discussed previously,
analogical reasoning frequently requires the 
extensive retrieval of semantic information 
in addition to the relational processing char- 
acteristic of all types of reasoning (see 
Holyoak, Chap. 6, for a detailed discussion 
of analogical reasoning). Waltz et al. (2000) 
had participants perform an analogical rea- 
soning task while performing one of several 
secondary tasks. In the analogical reasoning 
task (adapted from Markman & Gentner, 
1993), participants studied pairs of pictures 
of scenes with multiple objects (see Figure 
6.3 in Holyoak, Chap. 6). For instance, one 
problem showed a boy trying to walk a dog 
in one picture while the companion picture 
showed a dog failing to be restrained by a 
leash tied to a tree. Participants were asked to 
study each picture and pick one object in the 
second picture that “goes with” a target ob- 
ject in the first picture. In the example prob- 
lem in Figure 6.3, the man in the first picture 
is a featural match to the boy in the second 
picture, whereas by analogy the boy is a
relational match to the tree in the second
picture. Participants were simply asked to select
one object; they were thus free either to complete
the task based on featural similarity
or to make an analogical mapping and
inference, answering based on relational similarity.
Waltz et al. found that participants
who maintained a concurrent memory load 
or performed verbal random number gener- 
ation or articulatory suppression (i.e., saying 
the word “the” once each second) gave fewer 
relational responses than a control group not 
performing a dual task. In a recent extension 
with this task, my lab replicated Waltz et al.'s 
articulatory suppression finding (i.e., saying 
the English nonword “zorn” once each sec- 
ond) and also found a similar effect for a vi- 
suospatial working-memory dual task (man- 
ually tapping a simple spatial pattern). 

In the previous studies, the extent of in- 
terference with the analogy task was similar 
for both central executive (concurrent memory
load and verbal random number generation)
and slave system (articulatory suppression
and spatial tapping) dual tasks. One
possibility is that analogical
reasoning is more resource demanding than
the deductive and spatial reasoning tasks 
previously discussed, and thus even the slave 
system tasks cause significant interference. 
Another possibility is that analogical reason- 
ing places greater demands specifically on 
the modality-specific slave systems of work- 
ing memory than other forms of relational 
reasoning. To investigate this issue, Morri- 
son, Holyoak, and Truong (2001) had par- 
ticipants perform either a verbal or visual 
analogy task, while performing articulatory 
suppression (i.e., saying the nonword “zorn” 
once a second), spatial tapping (i.e., touch- 
ing one of four red dots each second in a 
predetermined pattern), or verbal random 
number generation. In the verbal analogy 
task, participants verified verbal analogies, 
such as BLACK:WHITE::NOISY:QUIET, 
answering either TRUE or FALSE via a floor 
pedal. In the visual analogy task, participants 
performed Sternberg’s (1977) People Pieces 
analogy task. In this task, participants ver- 
ify whether the relational pattern of charac- 
teristics between two cartoon characters is 
the same or different than between a sec- 
ond pair of characters. Morrison, Holyoak, 
and Truong found that, for verbal analogies, 
articulatory suppression and verbal random 
number generation resulted in an increase in 
analogy error rate, whereas only verbal ran- 
dom number generation increased analogy 
response time for correct responses. Spatial 
tapping had no reliable effect on verbal anal- 
ogy performance. In contrast, for visual anal- 
ogy, both spatial tapping and verbal random 
number generation resulted in more analogy 
errors, whereas only random generation in- 
creased analogy response time. Articulatory 
suppression had no reliable effect on visual 
analogy performance. Thus, there seems to 
be a modality-specific role for working memory
in analogical reasoning.

In summary, all of the reasoning tasks 
described in the previous section are inter- 
fered with by dual tasks considered to tap 
the central executive (e.g., random number 
generation or concurrent memory load). 
The deductive reasoning tasks reported
of premises that are provided in the problem.
In addition to these operations, analogical
reasoning may require the reasoner to
retrieve information from semantic memory 
(e.g., the relations that bind the terms in the
analogy) and then map the resulting rela- 
tional statements (and in some cases make 
an inference that requires retrieving a term 
that completes the analogy). 

To evaluate the extent to which working-memory
resources are necessary for semantic
memory retrieval and relational binding,
my lab went on to examine the component
processes in which working memory is necessary
for analogical reasoning. We wondered
whether working memory is necessary for 
the simple process of relational binding or 
only becomes necessary when multiple rela- 
tions need to be maintained and compared 
during the analogical mapping process. To 
address this question, we used the stim- 
uli from the verbal analogy task but simply 
asked participants to verify relational state- 
ments instead of comparing two of them as 
in the analogy task. Thus, participants would 
respond TRUE to a statement like “black is 
the opposite of white” and FALSE to the 
statement “noisy is the opposite of nois- 
ier.” As in the verbal analogy task, articula- 
tory suppression and verbal random number 
generation affected performance, with spatial
tapping also having a smaller but reliable
effect. Thus, relational binding, not just
maintenance and mapping, requires use of
the working-memory system, including the 
modality-specific slave systems. 


Individual Differences in 
Working Memory 


An alternative to Baddeley’s dual-task me- 
thodology uses individual differences to 
study working memory. Daneman and Car- 
penter (1980) first used this approach to 
investigate how working memory was in- 
volved in language comprehension. They de- 
veloped a reading span task that required 
subjects to read several sentences and then 


the correct order. The participant’s span is 
typically defined as the maximum-sized trial 
with perfect performance. This measure cor- 
related relatively well with individuals’ read- 
ing comprehension ability. Unlike a simple 
short-term memory-span task, the working- 
memory-span task required the subjects to 
do a more complex task while also remem- 
bering a list of items. In this way, the span 
task is believed to tap both the mainte- 
nance (slave system) and manipulation (cen- 
tral executive and episodic buffer) aspects 
of working memory. Other span tasks have 
been developed to vary the nature of the 
task that participants perform and what 
they maintain. For example, Turner and En- 
gle (1989) asked participants to solve sim- 
ple arithmetic problems and then remem- 
ber a word presented at the end of each 
problem. In the n-back task (Figure 19.5; 
see Smith & Jonides, 1997, for a complete description),
the manipulation task is changed
to having to continuously update the set 
of items. Using this approach, researchers 
have found working-memory capacity to be 
an important predictor of performance in a 
broad range of higher cognitive tasks, includ- 
ing reading comprehension (Daneman & 
Carpenter, 1980), language comprehension 
(Just & Carpenter, 1992), following direc- 
tions (Engle, Carullo, & Collins, 1991), rea- 
soning (Carpenter, Just, & Shell, 1990; Kyllo- 
nen & Christal, 1990), and memory retrieval 
(Conway & Engle, 1994).
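
The n-back procedure mentioned in this paragraph can be sketched as a scoring rule over a stream of items. The function name and the example stream are hypothetical; treating the first n items as non-matches is an assumption of this sketch, because those items have nothing n positions back to be compared with.

```python
from typing import List, Sequence


def n_back_answers(stream: Sequence[str], n: int) -> List[bool]:
    """For each item, True if it matches the item presented n positions earlier."""
    return [i >= n and stream[i] == stream[i - n] for i in range(len(stream))]


stream = list("ABACACC")  # invented example stream
answers = n_back_answers(stream, n=2)
print(answers)
```

At every position the participant must drop the oldest item and add the newest one to the set being held, which is the continuous updating that distinguishes this task from a simple span task.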

Researchers using working-memory-span 
measures typically measure participants’ 
working-memory span using one or more 
measures and then use this to predict per- 
formance on another task. A high correlation 
suggests that working memory is an impor- 
tant target for the task. More sophisticated 
studies collect a variety of other measures of 
information processing ability (e.g., process- 
ing speed or short-term memory span) and 
use either multiple regression or structural 
equation modeling to determine whether 
these various abilities are separable with 
respect to the target task. Engle and his 
collaborators (Engle, Kane, & Tuholski, 
1999; Kane & Engle, 2003b; see also

Figure 19.5. The n-back task. Participants see a stream of letters, numbers, or
symbols and have to continuously answer whether the current item was the 
same as the item presented “n-back” in the stream. This task requires 
maintenance of the current in-set item and continuous updating of this set — an 
ability considered to be manipulation of the set. 


Salthouse, Chap. 24) have used this ap- 
proach to argue that, although working- 
memory-span and short-term-memory-span 
tasks share much variance, it is working- 
memory capacity that best predicts higher 
cognitive performance as measured by tasks 
such as the Raven’s Progressive Matrices (see
Figure 19.6). 
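
The question of whether working-memory span predicts reasoning beyond what it shares with short-term-memory span can be illustrated with a first-order partial correlation. This is a deliberately simpler stand-in for the multiple regression and structural equation models used in the studies cited, not a reconstruction of them. The function names are hypothetical and the scores below are invented for illustration only.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_corr(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Correlation between x and y after removing what each shares linearly with z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical scores for eight participants, invented for illustration only.
wm_span = [2, 3, 3, 4, 4, 5, 5, 6]
stm_span = [4, 5, 6, 5, 7, 6, 8, 7]
reasoning = [10, 12, 13, 15, 16, 19, 20, 23]

r_partial = partial_corr(wm_span, reasoning, stm_span)
print(round(r_partial, 3))
```

A partial correlation that remains substantial after short-term-memory span is controlled would be consistent with the claim that the two kinds of span task are separable with respect to the target task; a latent-variable model is needed to make that argument properly.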

Kane and Engle believe that the ability 
measured by a working-memory-span task 
once simple maintenance is stripped away is 
best described as controlled attention. They 
have argued that working-memory capacity 
is a good predictor of task performance in 
tasks that (a) require maintenance of task 
goals, (b) require scheduling competing actions 
or responses, (c) involve response competition, 
or (d) involve inhibiting information 
irrelevant to the task (Engle, Kane, & 
Tuholski, 1999). This list is very similar to 
the functions that Baddeley (1996) attributes 
to the central executive. Obviously, these 
are the types of cognitive processes that 
are omnipresent in high-level cognition. 
They are also the types of cognitive abilities 
necessary to perform traditional tests 
of fluid or analytical intelligence such as 
the Raven’s Progressive Matrices (Raven, 1938), 
leading researchers to hypothesize that 
working-memory capacity is the critical 
factor that determines analytical intelligence 
(see Kane & Engle, 2003a; Sternberg, 
Chap. 34). 


The Where, What, and How 
of Working Memory and Thought 


So far, we have suggested that there are 
at least two important aspects of working 
memory for human thinking: a modality-specific 
maintenance function that is capable 
of preserving information over short 
periods of time and a manipulation or attentional 
control function that is capable 
of activating, operating on, and updating this 
information during conscious thought. Recently, 
cognitive neuroscientists have devoted 
much effort to answering the question 
of where in the brain these working-memory 
mechanisms operate. This topic is beyond 
the scope of this chapter [see Goel, 
Chap. 20, for a more detailed treatment of 
the cognitive neuroscience of problem solving 
and Chein, Ravizza, and Fiez (2003) for 
a recent appraisal of the ability of Baddeley’s 
and Cowan’s models to account for recent 
neuroimaging findings]; however, we know 
that at least several areas of the prefrontal 
and parietal cortices are critical for these 
functions. Although these areas may be specific 
to working memory, there is mounting 
evidence from both electrophysiology 
and functional magnetic resonance imaging 
(fMRI) that working memory is the result 
of activation of networks involving many 
brain regions.² A more interesting question 
than where is how working memory operates 
in thinking. Unfortunately, much less 
attention has been given to this question; 
however, several of the computational approaches 
outlined in this book begin to address 
this topic.³ 

Figure 19.6. Structural Equation Model of the relationship of working memory and short-term 
memory and their role in analytic problem solving and intelligence. From Engle, Kane, and Tuholski 
(1999). 


It is the belief of many of the authors in 
this volume that high-level cognition is in- 
trinsically relational in nature, a position long 
argued by many scientists (see Fodor and 
Pylyshyn, 1988; Spearman, 1923). In this 
account, one critical function for working 
memory to accomplish is the flexible binding 
of information stored in long-term memory. 
Working memory must also be able to nest 
relations to allow more complex knowledge 
structures to be used. Halford (Chap. 22) has 
referred to this factor as relational complexity. 
As the relational complexity of a particular 
problem increases, so do the demands placed 
on working memory. Goals are a particu- 
lar subclass of relations that are especially 
important in deductive reasoning (see Goel, 
Chap. 20). Maintaining the complex goal hi- 
erarchies (high relational complexity) nec- 
essary for solving complex problems such as 
those encountered in chess or in tasks such 
as the Raven’s Progressive Matrices or the 
Tower of Hanoi makes great demands on 
the working memory system (see Lovett & 
Anderson, Chap. 17; Carpenter, Just, & 
Shell, 1990; Newman et al., 2003). Most 
work directed at understanding how the 
brain implements working memory has 
[...] processing of relations is minimal. 

The ways in which the brain’s distributed 
architecture is used to process problems 
that require relational flexibility and relational 
complexity have just begun to be explored 
(see Christoff & Gabrieli, 2000; Christoff et 
al., 2001; Morrison et al., 2004; Prabhakaran 
et al., 2000; Waltz et al., 1999). Hum- 
mel and Holyoak’s (1997, 2003; see also 
Doumas & Hummel, Chap. 4) LISA model 
solves the binding problem created by the 
need for the flexible use of information in 
a distributed architecture. The LISA model 
dynamically binds roles to their fillers in 
working memory by temporal synchrony of 
firing. This allows the distributed informa- 
tion in long-term memory to be flexibly 
bound in different relations and for the sys- 
tem to appreciate that the various entities 
can serve different functions in different re- 
lations and relational hierarchies. It is possi- 
ble that one role of the prefrontal cortex is to 
control this synchrony process by firing the 
distributed network of neurons representing 
the actual fillers in long-term memory (see 
Doumas & Hummel, Chap. 4, and Morrison 
et al., 2004, for a more detailed account of 
this approach). Although no direct evidence 
exists for synchrony of binding in high-level 
relational systems, several studies in animals 
(e.g., Gray et al., 1989) and in humans (e.g., 
Müller et al., 1996; Ruchkin et al., 2003) 
suggest that synchrony may be an important 
mechanism for other cognitive processes im- 
plemented in the brain. This type of sys- 
tem is also consistent with Baddeley’s (2000) 
concept of an episodic buffer that binds in- 
formation together in working memory. 

Implicit in a working-memory system ca- 
pable of handling relations is not only the 
ability to precisely activate information in 
long-term memory but also the ability to 
deactivate or inhibit it. Consider the simple 
analogy problem: 


BLACK : WHITE :: NOISY : ?    (1) QUIET    (2) NOISIER 


If the semantic association between NOISY 
and NOISIER is stronger than that between 
NOISY and QUIET, the correct 
response, QUIET, may initially be less active 
because of spreading activation in memory 
than the distractor item, NOISIER. Thus, 
during reasoning, it may be necessary to inhibit 
information that is highly related but 
inconsistent with the current goal (Morrison 
et al., 2004). This function of working 
memory has also been ascribed to the 
prefrontal cortex (see Kane & Engle, 2003a, 
2003b; Miller & Cohen, 2001; and Shimamura, 
2000, for reviews). Many complex 
executive tasks associated with frontal lobe 
functioning (e.g., Tower of Hanoi or London, 
Analogical Reasoning, Wisconsin Card 
Sorting) have important inhibitory components 
(Miyake et al., 2000; Morrison et al., 
2004; Viskontas et al., in press; Welsh, 
Satterlee-Cartmell, & Stine, 1999). Shimamura 
(2000) suggested that the role of prefrontal 
cortex is to filter information dynamically, 
a process that requires the use of 
both activation and inhibition to keep information 
in working memory relevant to the 
current goal. Miller and Cohen (2001) argued 
that “the ability to select a weaker, task-relevant 
response (or source of information) 
in the face of competition from an otherwise 
stronger, but task-irrelevant one [is one of 
the most] fundamental aspects of cognitive 
control and goal-directed behavior” (p. 170) 
and is a property of prefrontal cortex. 
More generally, many researchers believe 
that inhibition is an important mechanism 
for complex cognition (see Dagenbach & 
Carr, 1994; Dempster & Brainerd, 1995; and 
Kane & Engle, 2003a, for reviews) and that 
changes in inhibitory control may explain 
important developmental trends (Bjorklund 
& Harnishfeger, 1990; Hasher & Zacks, 1988; 
Diamond, 1990) and individual differences 
(Dempster, 1991; Kane & Engle, 2003a, 
2003b) in complex cognition. 
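The role of inhibition in the BLACK : WHITE :: NOISY : ? problem can be illustrated with a toy sketch. Everything numeric here is an assumption made for illustration: the activation levels, the size of the inhibitory penalty, and the function name `choose_response` are invented and are not taken from Morrison et al. (2004) or from any fitted model. The sketch only shows the logic that a strongly associated distractor wins unless goal-inconsistent candidates are inhibited.

```python
from typing import Dict


def choose_response(activation: Dict[str, float],
                    goal_consistent: Dict[str, bool],
                    inhibition: float = 0.0) -> str:
    """Return the most active candidate after inhibiting goal-inconsistent ones.

    `inhibition` is subtracted from the activation of every candidate that
    does not satisfy the current goal (here, the relation "opposite of").
    """
    adjusted = {}
    for word, level in activation.items():
        penalty = 0.0 if goal_consistent[word] else inhibition
        adjusted[word] = level - penalty
    return max(adjusted, key=adjusted.get)


# Hypothetical activation after spreading activation from NOISY:
# the close associate NOISIER starts out more active than QUIET.
activation = {"QUIET": 0.6, "NOISIER": 0.8}

# Only QUIET stands in the "opposite of" relation that BLACK : WHITE sets up.
goal_consistent = {"QUIET": True, "NOISIER": False}

if __name__ == "__main__":
    print(choose_response(activation, goal_consistent, inhibition=0.0))
    print(choose_response(activation, goal_consistent, inhibition=0.5))
```

With no inhibition the distractor is selected; with sufficient inhibition of the goal-inconsistent item the correct response is selected, which is the pattern the paragraph above describes.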


Conclusions and Future Directions 
Working memory is a set of central processes 
that make conscious thought possible. It 
flexibly provides for the maintenance and 
activation and inhibition of information retrieved 
from long-term memory and newly 
accessed from perception. Relations are critical 
to thought, and the working-memory system 
therefore must provide for the flexible 
binding of information. It allows the 
problem solver to maintain goals that permit 
successful navigation of single problems and 
also allows for integration of the various parts of 
larger problems. Working-memory capacity 
is limited, and this is an important individual 
difference that affects and perhaps even 
determines analytic intelligence. We know 
that working memory is critically dependent 
on prefrontal cortex functioning, but it likely 
involves the successful activation and inhibition 
of large networks in the brain. Maintenance 
of information in working memory 
tends to be somewhat modality specific; 
however, attentional resources typically ascribed 
to a central executive tend to be 
more modality independent and allow for 
the connection of information from different 
modalities. 

The future of working-memory research 
resides in better understanding how these 
processes operate in the brain. Computational 
approaches allow researchers to make 
precise statements about the functional processes 
necessary for a working-memory system 
to support thinking and can provide 
useful predictions for evaluation with cognitive 
neuroscience methods. Whereas much 
effort has been devoted to understanding 
where working memory resides in the cortex, 
much less attention has focused on how 
it functions. Understanding the neural processes 
underlying working memory will almost 
certainly require tight integration of 
methods that provide good spatial localization 
(e.g., fMRI) and good temporal information 
(e.g., electrophysiology) in the brain. 


Notes 


1. The term “working memory” was originally 
used to describe rat behavior during radial arm 
maze learning (see Olton, 1979, for a description 
of this literature). It was also used 
by Newell and Simon (1972) to describe the 
component of their computational models that 
holds productions, that is, operations that 
the model must perform (see also Lovett and 
Anderson, Chap. 17). 


2. Fuster (1997) has long argued for this approach 
to working memory based on electrophysiolog- 
ical and cortical cooling data from nonhuman 
primates. In Fuster’s model, neurons in pre- 
frontal cortex drive neurons in more posterior 
brain regions that code for the information to 
be activated in long-term memory. This per- 
spective is also consistent with Cowan’s (1988) 
Embedded-Processes model. See also Chein, 
Ravizza, and Fiez, 2003. 

3. Both ACT (Lovett and Anderson, Chap. 17) 
and LISA (Doumas and Hummel, Chap. 4) 
provide accounts of how working memory may 
be involved in higher-level cognition. These 
theories and computational implementations 
provide excellent starting points for investi- 
gating how the brain actually accomplishes 
high-level thought. An excellent edited vol- 
ume by Miyake and Shah (1999) reviews many 
of the traditional computational perspectives 
on working memory. 


References 


Atkinson, R. C. & Shiffrin, R. M. (1968). Hu- 
man memory: A proposed system and its con- 
trol processes. In K. W. Spence, & J. T. Spence 
(Eds.), The Psychology of Learning and Motiva- 
tion Vol. 2 (pp. 89-195). New York: Academic 
Press. 

Baddeley, A. D. (1966). The capacity for gener- 
ating information by randomization. Quarterly 
Journal of Experimental Psychology, 18, 119-29. 


Baddeley, A. D. (1986). Working Memory. Oxford, 
UK: Oxford University Press. 

Baddeley, A. D. (1996). Exploring the central ex- 
ecutive. Quarterly Journal of Experimental Psy- 
chology, 49A, 5-28. 

Baddeley, A. D. (2000). The episodic buffer: A 
new component of working memory? Trends 
in Cognitive Sciences, 4, 417-423. 

Baddeley, A., Emslie, H., Kolodny, J., & Dun- 
can, J. (1998). Random generation and the ex- 
ecutive control of working memory. Quarterly 
Journal of Experimental Psychology, 51A, 819-852. 

Baddeley, A., & Hitch, G. J. (1974). Working 
memory. In G. H. Bower (Ed.), The Psychology 
of Learning and Motivation Vol. 8 (pp. 47-89). 
New York: Academic Press. 

Bjorklund, D. F., & Harnishfeger, K. K. 
(1990). The resources construct in cognitive 
development: Diverse sources of evidence and 
a theory of inefficient inhibition. Developmen- 
tal Review, 10, 48-71. 

Brooks, L. R. (1968). Spatial and verbal compo- 
nents of the act of recall. Canadian Journal of 
Psychology, 22, 349-368. 

Carpenter, P. A., Just, M. A., & Shell, P. (1990). 
What one intelligence test measures: A theoretical 
account of the processing in the Raven 
Progressive Matrices Test. Psychological Review, 
97, 404-431. 

Cattell Culture Fair Test (Institute for Personality 
and Ability Testing, 1973). Champaign, IL. 

Chein, J. M., Ravizza, S. M., & Fiez, J. A. 
(2003). Using neuroimaging to evaluate models 
of working memory and their implications 
for language processing. Journal of Neurolinguistics, 
16, 315-339. 

Christoff, K., & Gabrieli, J. D. E. (2000). The 
frontopolar cortex and human cognition: Evi- 
dence for a rostrocaudal hierarchical organiza- 
tion within the human prefrontal cortex. Psy- 
chobiology, 28, 168-186. 

Christoff, K., Prabhakaran, V., Dorfan, J., Zhao, 
Z., Kroger, J. K., Holyoak, K. J. et al. (2001). 
Rostrolateral prefrontal cortex involvement in 
relational integration during reasoning. Neu- 
roimage, 14, 1136-1149. 

Conrad, R. (1964). Acoustic confusion in imme- 
diate memory. British Journal of Psychology, 55, 
75-84. 

Conway, A. R. A., & Engle, R. W. (1994). 
Working memory and retrieval: A resource-dependent 
inhibition model. Journal of Experimental 
Psychology: General, 123, 354-373. 

Cowan, N. (1988). Evolving conceptions of mem- 
ory storage, selective attention, and their mu- 
tual constraints within the human information 
processing system. Psychological Bulletin, 104, 
163-191. 

Cowan, N. (1995). Attention and Memory: An In- 
tegrated Framework. New York: Oxford Univer- 
sity Press. 


Cowan, N. (1999). An embedded-processes 
model of working memory. In Akira Miyake & 
Priti Shah (Eds.), Models of Working Memory: 
Mechanisms of Active Maintenance and Exec- 
utive Control (pp. 62-101). Cambridge, UK: 
Cambridge University Press. 


Dagenbach, D. E., & Carr, T. H. (1994). Inhibitory 
Processes in Attention, Memory, and Language. 
San Diego, CA: Academic Press. 


Daneman, M., & Carpenter, P. A. (1980). Individ- 
ual differences in working memory and read- 
ing. Journal of Verbal Learning and Verbal Be- 
havior, 19, 450-466. 

Dempster, F. N. (1991). Inhibitory processes: 
A neglected dimension of intelligence. Intelli- 
gence, 15, 157-173. 

Dempster, F. N., & Brainerd, C. J. (Eds.). (1995). 
Interference and inhibition in cognition. San 
Diego, CA: Academic Press. 

de Renzi, E., & Nichelli, P. (1975). Verbal and 
non-verbal short-term memory impairment 
following hemispheric damage. Cortex, 11, 
341-354. 

Diamond, A. (1990). The development and neu- 
ral bases of memory functions as indexed by 
the A-not-B and delayed response tasks in human 
infants and infant monkeys. In A. Diamond 
(Ed.), The Development and Neural Bases of 
Higher Cognitive Functions (pp. 267-317). New 
York: New York Academy of Sciences. 

Engle, R. W., Carullo, J. J., & Collins, K. W. 
(1991). Individual differences in working mem- 
ory for comprehension and following direc- 
tions. Journal of Educational Research, 84, 253- 
262. 


Engle, R. W., Kane, M. J., & Tuholski, S. W. 
(1999). Working memory and controlled attention. 
In Akira Miyake & Priti Shah (Eds.), Models 
of Working Memory: Mechanisms of Active 
Maintenance and Executive Control (pp. 102- 
134). Cambridge, UK: Cambridge University 
Press. 


Ericsson, K. A., & Kintsch, W. (1995). Long-term 
working memory. Psychological Review, 102, 
211-245. 

Fodor, J. A., & Pylyshyn, Z. W. (1988). Connec- 
tionism and cognitive architecture: A critical 
analysis. In S. Pinker & J. Mehler (Eds.), Con- 
nections and Symbols (pp. 3-71). Cambridge, 
MA: MIT Press. 

Fuster, J. M. & Alexander, G. E. (1971). Neuron 
activity related to short-term memory. Science, 
173, 652-654. 

Fuster, J. M. (1997). The Prefrontal Cortex: 
Anatomy, Physiology, and Neuropsychology of 
the Frontal Lobe. (3rd ed.). Philadelphia: 
Lippincott-Raven. 

Fuster, J. M., Bodner, M., & Kroger, J. K. (2000). 
Cross-modal and cross-temporal association in 
neurons of frontal cortex. Nature, 405, 347- 
351. 

Gilhooly, K. J., Logie, R. H., Wetherick, N. E., & 
Wynn, V. (1993). Working memory and strate- 
gies in syllogistic-reasoning tasks. Memory and 
Cognition, 21, 115-124. 

Gray, C. M., Koenig, P., Engel, A. K., & Singer, 
W. (1989). Oscillatory responses in cat visual 
cortex exhibit inter-columnar synchronization 
which reflects global stimulus properties. Nature, 
338, 334-337. 

Hasher, L., & Zacks, R. T. (1988). Working memory, 
comprehension, and aging: A review and a 
new view. In G. Bower (Ed.), The Psychology of 
Learning and Motivation: Advances in Research 
and Theory (Vol. 22, pp. 193-225). San Diego, 
CA: Academic Press. 

Hebb, D. O. (1949). The Organization of Behavior. 
New York: Wiley. 


Hummel, J. E., & Holyoak, K. J. (1997). Dis- 
tributed representations of structure: A theory 
of analogical access and mapping. Psychological 
Review, 104, 427-466. 

Hummel, J. E., & Holyoak, K. J. (2003). A 
symbolic-connectionist theory of relational in- 
ference and generalization. Psychological Re- 
view, 110, 220-264. 

James, W. (1890). Principles of Psychology. New 
York: Henry Holt & Co. 

Johnson-Laird, P. N. (1983). Mental Mod- 
els. Cambridge, UK: Cambridge University 
Press. 

Just, M. A., & Carpenter, P. A. (1992). A capacity 
theory of comprehension: Individual differences 
in working memory. Psychological Review, 
99, 122-149. 

Kane, M. J., & Engle, R. W. (2003a). The role 
of prefrontal cortex in working-memory capacity, 
executive attention, and general fluid 
intelligence: An individual-differences perspective. 
Psychonomic Bulletin and Review, 9, 
637-671. 

Kane, M. J., & Engle, R. W. (2003b). Working- 
memory capacity and the control of attention: 
The contributions of goal neglect, response 
competition, and task set to Stroop interfer- 
ence. Journal of Experimental Psychology: Gen- 
eral, 132, 47-70. 

Kirby, K. N., & Kosslyn, S. M. (1992). Thinking 
visually. In G. W. Humphreys (Ed.), Under- 
standing Vision: An Interdisciplinary Perspective. 
(pp. 71-86). Oxford, UK: Blackwell. 

Klauer, K. C., Stegmaier, R., & Meiser, T. (1997). 
Working memory involvement in propositional 
and spatial reasoning. Thinking and Reasoning, 
3, 9-47. 

Kyllonen, P. C., & Christal, R. E. (1990). Reasoning 
ability is (little more than) working-memory 
capacity?! Intelligence, 14, 389-433. 

Logie, R. H. (1995). Visuo-spatial Working Mem- 
ory. Hillsdale, NJ: Erlbaum. 

Markman, A. B., & Gentner, D. (1993). Struc- 
tural alignment during similarity comparisons. 
Cognitive Psychology, 25, 431-467. 

Miller, E. K., & Cohen, J. D. (2001). An integra- 
tive theory of prefrontal cortex function. An- 
nual Review of Neuroscience, 24, 167-202. 

Miller, G. A. (1956). The magical number seven, 
plus or minus two: Some limits on our capac- 
ity for processing information. Psychological Re- 
view, 63, 81-97. 

Milner, B. (1966). Amnesia following operation 
on the temporal lobes. In C. W. M. Whitty & 
O. L. Zangwill (Eds.), Amnesia (pp. 109-133). 
London: Butterworths. 

Miyake, A., Friedman, N. P., Emerson, M. J., 
Witzki, A. H., & Howerter, A. (2000). The 
unity and diversity of executive functions and 
their contributions to complex ‘frontal lobe’ 
tasks: A latent variable analysis. Cognitive Psy- 
chology, 41, 49-100. 

Miyake, A. & Shah, P. (1999). Models of working 
memory: Mechanisms of active maintenance and 
executive control. New York: Cambridge Uni- 
versity Press. 


Morrison, R. G., Holyoak, K. J., & Truong, 
B. (2001). Working memory modularity in 
analogical reasoning. Proceedings of the Twenty-Third 
Annual Conference of the Cognitive 
Science Society. Erlbaum. 


Morrison, R. G., Krawczyk, D. C., Holyoak, K. J., 
Hummel, J. E., Chow, T. W., Miller, B. L., 
et al. (2004). A neurocomputational model 
of analogical reasoning and its breakdown in 
Frontotemporal Lobar Degeneration. Journal of 
Cognitive Neuroscience, 16, 1-11. 

Müller, M. M., Bosch, J., Elbert, T., Kreiter, A., 
Sosa, M. V., Sosa, P. V., et al. (1996). Visu- 
ally induced gamma-band responses in human 
electroencephalographic activity: A link to an- 
imal studies. Experimental Brain Research, 112, 
96-102. 

Murdock, B. B., Jr. (1962). The retention of in- 
dividual items. Journal of Experimental Psychol- 
ogy, 64, 482-488. 

Newman, S. D., Carpenter, P. A., Varma, S. & 
Just, M. A. (2003). Frontal and parietal par- 
ticipation in problem solving in the Tower of 
London: fMRI and computational modeling of 
planning and high-level perception. Neuropsy- 
chologia, 41, 2003. 

Newell, A., & Simon, H. A. (1972). Human prob- 
lem solving. Englewood Cliffs, NJ: Prentice- 
Hall. 

Norman, D. A., & Shallice, T. (1980). Atten- 
tion to action: Willed and automatic control of 
behaviour. University of California, San Diego 
CHIP Report 99. 

Norman, D. A., & Shallice, T. (1986). Attention 
to action: Willed and automatic control of be- 
havior. In G. E. Schwartz & D. Shapiro (Eds.), 
Consciousness and Self-Regulation. New York: 
Plenum Press. 

Olton, D. S. (1979). Mazes, maps and memory. 
American Psychologist, 34, 583-596. 

Prabhakaran, V., Narayanan, K., Zhao, Z., & 
Gabrieli, J. D. E. (2000). Integration of di- 
verse information in working memory within 
the frontal lobe. Nature Neuroscience, 3(1), 85-90. 

Raven, J. C. (1938). Progressive Matrices: A Per- 
ceptual Test of Intelligence, Individual Form. 
London: Lewis. 

Ruchkin, D. S., Grafman, J., Cameron, K., & 
Berndt, R. (2003). Working memory retention 
systems: A state of activated long-term memory. 
Behavioral and Brain Sciences, 26, 709-777. 

Shallice, T., & Warrington, E. K. (1970). Inde- 
pendent functioning of verbal memory store: 
A neuropsychological study. Quarterly Journal 
of Experimental Psychology, 22, 261-273. 

Shimamura, A. P. (2000). The role of the pre- 
frontal cortex in dynamic filtering. Psychobiol- 
ogy, 28, 207-218. 

Smith, E. E., & Jonides, J. (1997). Working 
memory: A view from neuroimaging. Cognitive 
Psychology, 33, 5-42. 

Spearman, C. (1923). The Nature of Intelligence 
and the Principles of Cognition. London, UK: 
Macmillan. 

Sternberg, R. J. (1977). Intelligence, Information 
Processing, and Analogical Reasoning: The Com- 
ponential Analysis of Human Abilities. Hillsdale, 
NJ: Erlbaum. 

Toms, M., Morris, N., & Ward, D. (1993). Work- 
ing memory and conditional reasoning. Quar- 
terly Journal of Experimental Psychology: Human 
Experimental Psychology, 46, 679-699. 

Turner, M. L. & Engle, R. W. (1989). Is work- 
ing memory capacity task dependent? Journal 
of Memory and Language, 28, 127-154. 

Viskontas, I. V., Morrison, R. G., Holyoak, K. J., 
Hummel, J. E., & Knowlton, B. J. (in press). 
Relational integration, inhibition and analog- 
ical reasoning in older adults. Psychology and 
Aging. 

Waltz, J. A., Knowlton, B. J., Holyoak, K. J., 
Boone, K. B., Mishkin, F. S., de Menezes Santos, 
M., et al. (1999). A system for relational 
reasoning in human prefrontal cortex. Psycho- 
logical Science, 10, 119-125. 

Waltz, J. A., Lau, A., Grewal, S. K., & Holyoak, 
K. J. (2000). The role of working memory in 
analogical mapping. Memory and Cognition, 28, 
1205-1212. 

Waugh, N. C. & Norman, D. A. (1965). Primary 
memory. Psychological Review, 72, 89-104. 

Welsh, M. C., Satterlee-Cartmell, T., & Stine, M. 
(1999). Towers of Hanoi and London: Contribution 
of working memory and inhibition to performance. 
Brain and Cognition, 41, 231-242. 




CHAPTER 20 


Cognitive Neuroscience 
of Deductive Reasoning 


Vinod Goel 


Introduction 


It is 4 p.m. and I hear the school bus pull up 
to the house. Soon there is the taunting of 
a 13-year-old boy followed by the exagger- 
ated screams of an 8-year-old girl. My kids 
are home from school. Exasperated, I say to 
my son, “If you want dinner tonight, you bet- 
ter stop tormenting your sister.” Given he 
doesn’t want to go to bed hungry, he needs 
to draw the correct logical inference. Sure 
enough, peace is eventually restored. We are 
not surprised by his actions. His behavior is 
not a mystery (if he wants his dinner). It 
is just an example of the reasoning brain at 
work. 

Reasoning is the cognitive activity of 
drawing inferences from given information. 
All reasoning involves the claim that one 
or more propositions (the premises) pro- 
vide some grounds for accepting another 
proposition (the conclusion). The aforementioned 
example involves a deductive inference 
(see Evans, Chap. 8). A key feature 
of deduction is that conclusions are con- 
tained within the premises and are logically 


independent of the content of the proposi- 
tions. Deductive arguments can be evaluated 
for validity, a relationship between premises 
and conclusion involving the claim that 
the premises provide absolute grounds for 
accepting the conclusion (i.e., if the pre- 
mises are true, then the conclusion must be 
true). 
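The definition of validity just given (if the premises are true, the conclusion must be true) can be checked mechanically for simple propositional arguments by enumerating every truth assignment. The following is a minimal sketch under that assumption; the helper names `implies` and `is_valid` and the variable names are introduced here for illustration, and the dinner example is encoded as a plain material conditional, which is a simplification of everyday conditionals.

```python
from itertools import product
from typing import Callable, Dict, Sequence

Assignment = Dict[str, bool]
Proposition = Callable[[Assignment], bool]


def implies(antecedent: bool, consequent: bool) -> bool:
    """Material conditional: false only when antecedent is true and consequent false."""
    return (not antecedent) or consequent


def is_valid(premises: Sequence[Proposition], conclusion: Proposition,
             variables: Sequence[str]) -> bool:
    """An argument is valid if no assignment makes all premises true
    while making the conclusion false."""
    for values in product([True, False], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(p(world) for p in premises) and not conclusion(world):
            return False  # counterexample found
    return True


VARIABLES = ["wants_dinner", "stops_tormenting"]


def conditional(w: Assignment) -> bool:
    # "If you want dinner tonight, you stop tormenting your sister."
    return implies(w["wants_dinner"], w["stops_tormenting"])


def wants_dinner(w: Assignment) -> bool:
    return w["wants_dinner"]


def stops_tormenting(w: Assignment) -> bool:
    return w["stops_tormenting"]


# Modus ponens: conditional, antecedent, therefore consequent.
modus_ponens = is_valid([conditional, wants_dinner], stops_tormenting, VARIABLES)

# Affirming the consequent: conditional, consequent, therefore antecedent.
affirming_consequent = is_valid([conditional, stops_tormenting], wants_dinner, VARIABLES)

if __name__ == "__main__":
    print(modus_ponens, affirming_consequent)
```

Renaming the variables leaves both results unchanged, which reflects the point that validity depends on the form of the argument and not on the content of the propositions.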


Psychological Theories of 
Deductive Reasoning 


Two theories of deductive reasoning (men- 
tal logic and mental models) dominate the 
cognitive literature. They differ with respect 
to the competence knowledge upon which 
they draw, the mental representations they 
postulate, the mechanisms they invoke, and 
the neuroanatomical predictions they make. 
Mental logic theories (Braine, 1978; Henle, 
1962; Rips, 1994) postulate that reason- 
ers have an underlying competence knowl- 
edge of the inferential role of the closed- 
form, or logical terms, of the language (e.g., 
“all, some, none, and,” etc.). The internal 
representations of arguments preserve the 
structural properties of the propositional 
strings in which the premises are stated. A 
mechanism of inference is applied to these 
representations to draw conclusions from 
premises. Essentially, the claim is that de- 
ductive reasoning is a rule-governed process 
defined over syntactic strings. 

By contrast, mental model theory 
(Johnson-Laird, 1983; Johnson-Laird & 
Byrne, 1991; see Johnson-Laird, Chap. 9) 
postulates that reasoners have an underlying 
competence knowledge of the meaning of 
the closed-form, or logical terms, of the 
language (e.g., “all, some, none, and,” etc.)¹ 
and use this knowledge to construct and 
search alternative scenarios.² The internal 
representations of arguments preserve the 
structural properties of the world (e.g., spa- 
tial relations) that the propositional strings 
are about, rather than the structural proper- 
ties of the propositional strings themselves. 
The basic claim is that deductive reasoning 
is a process requiring spatial manipulation 
and search. 

A third alternative is provided by dual 
mechanism theories. At a very crude level, 
dual mechanism theories make a distinc- 
tion between formal, deliberate, rule-based 
processes and implicit, unschooled, auto- 
matic processes. However, dual mechanism 
theories come in various flavors that dif- 
fer on the exact nature and properties of 
these two systems. Theories differentially 
emphasize explicit and implicit processes 
(Evans & Over, 1996), conscious and precon- 
scious processes (Stanovich & West, 2000), 
formal and heuristic processes (Newell & 
Simon, 1972; also see Kahneman & Fred- 
erick, Chap. 12), and associative and rule- 
based processes (Goel, 1995; Sloman, 1996). 
The relationship among these proposals has 
yet to be clarified. 


Relevance and Role 
of Neurophysiological Data 


The reader will note that these theories 
of reasoning are strictly cognitive theories. 
This is not an oversight. Until recently, the central 
domains of human reasoning and problem 
solving have been largely cognitive and com- 
putational enterprises, with little input from 
neuroscience. In fact, an argument advanced 
by cognitive scientists — based on the in- 
dependence of computational processes and 
the mechanism in which they are realized 
(i.e., the brain) — has led many to question 
the relevance of neuropsychological data to 
cognitive theories. 

The “independence of computational 
level” argument is a general argument against 
the necessity of appealing to neurophysi- 
ology to capture the generalizations nec- 
essary to explain human mental life. The 
general idea is that liberation from neuro- 
physiology is one of the great virtues of the 
cognitive/computational revolution. It gives 
us the best of both worlds. It allows us to 
use an intentional vocabulary in our psy- 
chological theories, and if this vocabulary 
meets certain (computational) constraints, 
we get a guarantee (via the Church-Turing 
hypothesis) that some mechanism will be 
able to instantiate the postulated process.³ 
Beyond this, we don’t have to worry about 
the physical. The psychological vocabu- 
lary will map onto the computational vo- 
cabulary, and it is, after all, cognitive/ 
computational structure, not physical struc- 
ture, that captures the psychologically inter- 
esting generalizations. 

The argument can be articulated as 
follows: 


(P1) There are good reasons to believe 
that the laws of psychology need to be 
stated in intentional vocabulary (Fodor, 
1975; Pylyshyn, 1984). 

(P2) Computation (sort of) gives us such 
a vocabulary (Cummins, 1989; Fodor, 
1975; Goel, 1991, 1995; Newell, 1980a; 
Pylyshyn, 1984). 

(P3) Our theory construction is moti- 
vated by computational concepts and 
constrained by behavioral data. 

(P4) Computational processes are speci- 
fied independently of physics and can 
be realized in any physical system. 


(C1) There is no way, in principle, 
that neurological data can constrain 
our computational/cognitive theories. 


A closer examination will reveal at least 
two flaws in the argument. First, premise P4 
is not strictly true. Computational processes 
cannot be realized in any and every system 
(Giunti, 1997; Goel, 1991, 1992, 1995). If 
it were true, then computational explana- 
tions would be vacuous (Searle, 1990) and 
our problems much more serious. Now, it is 
true that computational processes can be realized 
in multiple systems, but that is far removed 
from universal realizability. The former 
gives computational theorizing much of 
its power; the latter drains computational 
explanations of much of their substantive 
content. 

Second, the conclusion C1 depends on 
what computational/cognitive theories seek 
to explain. It is true that the organization 
of a computing mechanism (for example, 
whether a Turing Machine has one head or 
two) is irrelevant when we are interested in 
specifying what function is being computed 
and are concerned only with the mappings of 
inputs to outputs. This is typically a concern 
for mathematicians and logicians. If cognitive 
theories were only to enumerate the functions 
being computed, then the argument 
would seem to hold. However, cognitive sci- 
entists (and often computer scientists) have 
little interest in computation under the as- 
pect of functions. Our primary concern is 
with the procedures that compute the func- 
tions (Marr, 1982). Real-time computation 
is a function of architectural considerations 
and resource availability and allocation. And 
it is real-time computation — the study of 
the behavioral consequences of different re- 
source allocation and organization models — 
that must be of interest to cognitive science 
(Newell, 1980a; Newell & Simon, 1976), be- 
cause it is only with respect to specific archi- 
tectures that algorithms can be specified and 
compared (to the extent that they can be). 
If we are interested in the computational ar- 
chitecture of the mind — and we clearly are 
(Newell, 1990; Pylyshyn, 1984) — then the 
constraints provided by the mechanism that 
realizes the computational process become 
very relevant. Presumably neuroscience is 
where we will learn about the architectural 
constraints imposed on the human cogni- 
tive/computational system. As such, it can 
hardly be ignored. 

But this whole line of argument and coun- 
terargument makes an unwarranted assump- 
tion. It assumes that the only contribu- 
tion neuroscience can make is in terms of 
specifying mechanisms. However, a glance 
through any neuroscience text (e.g., Kandel, 
Schwartz, & Jessell, 1995) shows that neu- 
roscience is still far from making substantive 
contributions to our understanding of the 
computational architecture of the central 
nervous system. This is many years in the 
future. 

There are, however, two more immedi- 
ate contributions — localization and dissocia- 
tion — that cognitive neuroscience can make 
to our understanding of cognitive processes, 
including reasoning. 


(1) Localization of brain functions: It is now 
generally accepted that Franz Joseph 
Gall (Forster, 1815) was largely right and 
Karl Lashley (1929) largely wrong about 
the organization of the brain. There is 
a degree of modularity in its overall 
organization. Over the years, neuropsy- 
chologists and neuroscientists have ac- 
cumulated some knowledge of this or- 
ganization. For example, we know some 
brain regions are involved in processing 
language and other regions process vi- 
sual spatial information. Finding selec- 
tive involvement of these regions in com- 
plex cognitive tasks such as reasoning can 
help us differentiate between compet- 
ing cognitive theories that make differ- 
ent claims about linguistic and visuospa- 
tial processes in the complex task (as do 
mental logic and mental model theories 
of reasoning). 


(2) Dissociation of brain functions: Brain le- 
sions result in selective impairment of 
behavior. Such selective impairments are 
called dissociations. A single dissociation 
occurs when we find a case of a lesion in 
region x resulting in a deficit of function 
a but not of function b. If we find another 
case, in which a lesion in region y re- 
sults in a deficit in function b but not 
in function a, then we have a double dis- 
sociation. Recurrent patterns of dissocia- 
tion provide an indication of causal joints 
in the cognitive system invisible in un- 
interrupted normal behavioral measures 
(Shallice, 1988). Lesion studies identify 
systems necessary for the cognitive pro- 
cesses under consideration. Neuroimag- 
ing studies identify cortical regions suf- 
ficient for various cognitive processes.+ 
Both are sources of knowledge regarding 
dissociation of cognitive functions. 
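
The logic of single and double dissociations can be made concrete with a small sketch. The lesion records below are hypothetical placeholders (regions "x" and "y", functions "a" and "b", mirroring the wording of the definition) and are not data from any study; the function names are likewise illustrative.

```python
from itertools import combinations


def single_dissociations(cases):
    """Return (region, impaired_function, spared_function) triples.

    `cases` maps a lesioned region to a dict of function -> True if impaired.
    """
    found = []
    for region, deficits in cases.items():
        for impaired, is_impaired in deficits.items():
            if not is_impaired:
                continue
            for spared, spared_is_impaired in deficits.items():
                if not spared_is_impaired:
                    found.append((region, impaired, spared))
    return found


def double_dissociations(cases):
    """Pair up single dissociations that mirror each other across two regions."""
    doubles = []
    for (r1, a1, b1), (r2, a2, b2) in combinations(single_dissociations(cases), 2):
        if r1 != r2 and a1 == b2 and b1 == a2:
            doubles.append(((r1, a1), (r2, a2)))
    return doubles


# Hypothetical records: lesion in x impairs a but spares b; lesion in y does the reverse.
cases = {
    "x": {"a": True, "b": False},
    "y": {"a": False, "b": True},
}
print(double_dissociations(cases))
```

With only the first record present, the sketch reports a single dissociation and no double dissociation, which is the asymmetry the definition turns on.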




The identification of dissociations is the 
more important of these two contributions 
and warrants further discussion. Cognitive 
theories are functional theories. Functional 
theories are notoriously underconstrained. 
That is, they are “black box” theories. We 
usually use them when we do not know the 
underlying causal structure. This devalues 
the currency of functional distinctions. But 
if we can show that our functional distinc- 
tions map onto causally individuated neuro- 
physiological structures, then we can have 
much greater confidence in the functional 
individuation. 

By way of an example, suppose that we 
individuate the following three functions on 
the basis of behavioral data: (f1) raise left 
arm, (f2) raise left foot, (f3) wiggle right 
ear. If these functions could be mapped 
onto three causally differentiated structures 
in a one-to-one fashion, we would be jus- 
tified in claiming to have discovered three 
distinct functions. If, however, all three of 
our behaviorally individuated functions map 
onto one causally differentiated structure, in 
a many-to-one fashion, we would say that 
our functional individuation was too fine- 
grained and collapse the distinctions until 
we achieved a one-to-one mapping. That is, 
raising the left arm does not constitute a 
distinct function from raising the left foot 
and wiggling the right ear, but the conjunc- 
tion of the three does constitute a single 
function. If we encountered the reverse sit- 
uation, in which one behavioral function 
mapped onto several causally distinct struc- 
tures, we would conclude that our indi- 
viduation was too coarse-grained and re- 
fine it until we achieved a one-to-one map- 
ping. One final possibility is a many-to-many 
mapping between our functional individua- 
tion and causally individuated physiological 
structures. Here we would have a total cross- 
classification and would have to assume that 
our functional individuations (f1, f2, f3) 
were simply wrong and start over again.° 
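
The four cases just described (one-to-one, many-to-one, one-to-many, many-to-many) can be summarized in a short sketch. It assumes the mapping is given as a set of (function, structure) links; the structure labels s1, s2, s3 are hypothetical and serve only to illustrate the classification.

```python
def classify_mapping(links):
    """Classify a set of (function, structure) links.

    many-to-one  -> functional individuation too fine-grained
    one-to-many  -> functional individuation too coarse-grained
    many-to-many -> cross-classification
    """
    functions = {f for f, _ in links}
    structures = {s for _, s in links}
    # Does any single function map onto more than one structure?
    function_splits = any(
        len({s for f2, s in links if f2 == f}) > 1 for f in functions
    )
    # Does any single structure receive more than one function?
    structure_shared = any(
        len({f for f, s2 in links if s2 == s}) > 1 for s in structures
    )
    if function_splits and structure_shared:
        return "many-to-many"
    if structure_shared:
        return "many-to-one"
    if function_splits:
        return "one-to-many"
    return "one-to-one"


# Hypothetical mappings for the three behaviorally individuated functions.
print(classify_mapping({("f1", "s1"), ("f2", "s2"), ("f3", "s3")}))
print(classify_mapping({("f1", "s1"), ("f2", "s1"), ("f3", "s1")}))
```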
The most famous example of a dissoci- 
ation comes from the domain of language. 
In the 1860s, Paul Broca described a pa- 
tient with a lesion to the left posterior infe- 
rior frontal lobe who had difficulties in the 
production of speech but was quite capa- 
ble of speech comprehension. This is a case 
of a single dissociation. In the 1870s, Carl 
Wernicke described two patients with le- 
sions to the posterior regions of the supe- 
rior temporal gyrus who had difficulty in 
speech comprehension but were quite fluent 
in speech production. Jointly, the two ob- 
servations indicate a double dissociation and 
tell us something important about the causal 
independence of language production and 
comprehension systems. If this characteri- 
zation is accurate (and there are now some 
questions about its accuracy), it tells us that 
any cognitive theory of speech production 
and comprehension needs to postulate two 
causally distinct functions or mechanisms. 


Neuroanatomical Predictions of Cognitive 
Theories of Reasoning 


Given that the relevance of neuroanatomical 
data to cognitive theories has not been fully 
appreciated, it is not surprising that there 
are few explicit neuroanatomical predictions 
made by these theories. The one exception is 
mental model theory. Johnson-Laird (1994) 
has predicted that if mental model theory 
is correct, then reasoning must occur in the 
right hemisphere. The rationale here pre- 
sumably is that mental model theory offers a 
spatial hypothesis, and anecdotal neuropsy- 
chological evidence suggests that spatial pro- 
cessing occurs in the right hemisphere. A 
more accurate prediction for mental model 
theory would be that the neural structures 
for visuospatial processing contribute the ba- 
sic representational building blocks used for 
logical reasoning (i.e., the visuospatial sys- 
tem is necessary and sufficient for reason- 
ing). I will use the latter prediction. 

By contrast, mental logic theory is a lin- 
guistic hypothesis (Rips, 1994) and needs 
to predict that the neuroanatomical mecha- 
nisms of language (syntactic) processing un- 
derwrite human reasoning processes [i.e., 
that the language (syntactic) system is both 
necessary and sufficient for deductive rea- 
soning]. Both mental model and mental logic 
theories make explicit localization predic- 
tions (i.e., whether linguistic or visuospa- 
tial systems are involved) and implicit dis- 
sociation predictions — specifically, that the 
one system is necessary and sufficient for 
reasoning. 

Dual mechanism theory needs to predict 
the involvement of two different brain sys- 
tems in human reasoning, corresponding to 
the formal, deliberate, rule-based sys- 
tem and the implicit, unschooled, automatic 
system. But it is difficult to make a pre- 
diction about localization without further 
specification of the nature of the two sys- 
tems. Nonetheless, dual mechanism theory 
makes a substantive prediction about a dis- 
sociation in the neural mechanisms underly- 
ing the two different forms of reasoning. 


Functional Anatomy of Reasoning 


My colleagues and I have been carrying out 
a series of studies to investigate the neural 
basis of logical reasoning (Goel et al., 2000; 
Goel & Dolan, 2000, 2001, 2003; Goel et al., 
1995, 1997, 1998). Our initial goal was to 
address the hypotheses made by the cogni- 
tive theories of reasoning and, in particu- 
lar, differentiate between mental logic and 
mental model theories. We have made some 
progress along these lines (although with 
surprising results) and have also provided 
insights into the role of prefrontal cortex 
(PFC) in logical reasoning. 


BASIC PARADIGM AND STRATEGY 


We have been presenting subjects with syllo- 
gisms, each consisting of two premises and a 
conclusion (e.g., All dogs are pets; All poo- 
dles are dogs; All poodles are pets), while 
they undergo positron emission tomography 
or functional magnetic resonance imaging 
(fMRI) brain scans and asking them to ex- 
hibit knowledge of what logically follows 
from the premises by confirming or denying 
the given conclusion. Our strategy has been 
to (largely) stay with one type of argument 
(syllogisms), manipulate content (holding 
the logically relevant information constant), 
and see how the brain reacts. The specific 
content manipulations are described in the 
studies discussed subsequently. 
Neuroimaging studies typically require 
a rest or baseline condition against which 
to compare the active condition. For our 
baseline tasks (in the fMRI studies) we used 
trials in which the first two sentences were 
related but the third sentence was unrelated 
(e.g., All dogs are pets; All poodles are dogs; 
All fish are scaly). Stimuli were presented 
one sentence at a time with each sentence 
staying up until the end of the trial. Trials ap- 
peared randomly in an event-related design 
(Figure 20.1). The task in all trials was the 
same. Subjects were required to determine 
whether the conclusion followed logically 
from the premises (i.e., whether the argu- 
ment was valid). In baseline trials in which 
the first two sentences were related, subjects 
would begin to construct a representation of 
the problem, but when the third, unrelated, 
sentence appeared they would immediately 
disengage the task and respond “no.” In 
reasoning condition trials in which the three 
sentences constituted an argument, subjects 
would continue with the reasoning com- 
ponent of the task after the presentation of 
the third sentence. The difference between 
completing the reasoning task and disen- 
gaging after the presentation of the third 
sentence isolates the reasoning components 
of interest. The data were modeled after the 
presentation of the third sentence. The pre- 
sentation of the first two sentences and sub- 
jects’ motor responses were modeled out as 
events of no interest. This basic design was 
used in each of the imaging studies discussed 
subsequently. 


Figure 20.1. Stimuli presentation: Stimuli from all conditions were presented 
randomly in an event-related design. An “*” indicated the start of a trial at 0 
milliseconds. The sentences appeared on the screen one at a time, with the first 
sentence appearing at 500 milliseconds, the second at 3500 milliseconds, and the 
last sentence at 6500 milliseconds. The duration of trials varied from 10.25 to 
14.35 seconds, leaving subjects 3.75 to 7.85 seconds to respond. 
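
The trial timeline given in the Figure 20.1 caption can be checked with a little arithmetic: the response window is the trial duration minus the onset of the third sentence. The sketch below uses only the numbers stated in the caption; the variable and function names are illustrative.

```python
# Sentence onsets (ms) and trial duration range (s), as stated in the Figure 20.1 caption.
SENTENCE_ONSETS_MS = (500, 3500, 6500)
TRIAL_DURATION_RANGE_S = (10.25, 14.35)


def response_window_s(trial_duration_s, last_onset_ms=SENTENCE_ONSETS_MS[-1]):
    """Time left to respond once the third sentence has appeared."""
    return round(trial_duration_s - last_onset_ms / 1000, 2)


windows = tuple(response_window_s(d) for d in TRIAL_DURATION_RANGE_S)
print(windows)
```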

We chose to use syllogisms (which test 
knowledge of quantification and negation) 
for technical reasons. Imaging studies re- 
quire multiple presentations of stimuli to 
register a reliable neural signal. Syllogisms 
come in 64 different forms and therefore 
allow for multiple trial presentations with 
minimal or no repetition of form. 
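
One way to arrive at the figure of 64 is to count moods: each of the three sentences of a syllogism takes one of the four traditional categorical types (A: All, E: No, I: Some, O: Some ... not). Reading "forms" as moods is an assumption made here for illustration; the text does not spell out the counting.

```python
from itertools import product

# The four traditional categorical sentence types.
SENTENCE_TYPES = ("A", "E", "I", "O")

# A mood assigns one type to each of the two premises and the conclusion.
moods = ["".join(m) for m in product(SENTENCE_TYPES, repeat=3)]
print(len(moods), moods[:4])
```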

We chose to manipulate content because, 
logically, the content of an argument is irrel- 
evant to the determination of its validity. For 
example, the argument 


All men are mortal 
Socrates is a man 
Socrates is mortal 


is valid by virtue of the fact it has the follow- 
ing form: 


All A are B 
C is A 
C is B 


It remains valid irrespective of whether it 
is about Socrates or elephants. Validity is a 
function of the logical structure of the ar- 
gument as opposed to the content of the 
sentences. 
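
That validity depends on form alone can be illustrated by checking the form against every interpretation of A, B, and C in a small domain, with no reference to Socrates or elephants. This is a minimal brute-force sketch, not a general theorem prover; the three-element domain is an assumption that is adequate for forms this simple.

```python
from itertools import product


def is_valid(premises, conclusion, domain_size=3):
    """Search small finite models for a counterexample.

    A and B range over all subsets of the domain; c ranges over its elements.
    The form is valid if no model makes every premise true and the conclusion false.
    """
    elements = range(domain_size)
    for a_bits in product((False, True), repeat=domain_size):
        for b_bits in product((False, True), repeat=domain_size):
            A = {e for e in elements if a_bits[e]}
            B = {e for e in elements if b_bits[e]}
            for c in elements:
                if all(p(A, B, c) for p in premises) and not conclusion(A, B, c):
                    return False
    return True


def all_a_are_b(A, B, c):
    return A <= B


def c_is_a(A, B, c):
    return c in A


def c_is_b(A, B, c):
    return c in B


# All A are B; C is A; therefore C is B.
valid_form = is_valid([all_a_are_b, c_is_a], c_is_b)
# All A are B; C is B; therefore C is A (a superficially similar but invalid form).
invalid_form = is_valid([all_a_are_b, c_is_b], c_is_a)
print(valid_form, invalid_form)
```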

However, it is well known that the seman- 
tic contents of arguments affect people’s va- 
lidity judgments. In a classic study, Wilkins 
(1928) showed that subjects performed bet- 
ter on syllogisms containing sentences with 
familiar semantic content (e.g., “All apples 
are red”) than on syllogisms lacking seman- 
tic content (e.g., “All A are B”). When the 
semantic content of syllogisms was incon- 
gruent with beliefs (e.g., “All apples are poi- 
sonous”), performance suffered even more. 
These results have been explored and ex- 
tended in more recent literature (Cherubini 
et al., 1998; Evans, Barston, & Pollard, 
1983; Oakhill & Garnham, 1993; Oakhill, 
Johnson-Laird, & Garnham, 1989). The ef- 
fect is very robust and has challenged all the- 
ories of reasoning. 

We discuss our key findings subsequently. 
They include: (i) a dissociation between a 
frontal-temporal system and a parietal sys- 
tem as a function of the familiarity of the 
content of the reasoning material; (ii) asym- 
metrical involvement of right and left PFC, 
with the left PFC being necessary and some- 
times sufficient, and the right PFC being 
sometimes necessary (in unfamiliar, inco- 
herent, conflicting situations) but not suffi- 
cient for logical reasoning; and (iii) clarifying 
roles of right PFC and ventral medial PFC 
(VMPFC) in belief-logic conflict resolution. 


Figure 20.2. Main effect of reasoning [(content reasoning + no content 
reasoning) - (content preparation + no content preparation)] revealed activation 
of bilateral cerebellum (R > L), bilateral fusiform gyrus, left superior parietal 
lobe, left middle temporal gyrus, bilateral inferior frontal gyrus, bilateral basal 
ganglia nuclei (centered around the accumbens, caudate nucleus, and putamen), 
and brain stem. Reprinted with permission from Goel (2003). 


BASIC FINDINGS 


Dissociable neural networks. In Goel et al. 
(2000), we scanned eleven right-handed, 
normal subjects using event-related fMRI to 
measure task-related neural activity while 
they engaged in syllogistic reasoning. The 
study was designed to manipulate the pres- 
ence of content in logical reasoning. Half 
the arguments contained content sentences, 
such as 


All dogs are pets 
All poodles are dogs 
All poodles are pets 


and the other half contained “no content” 
versions of these sentences, such as 


All P are B 
All C are P 
All C are B 


The logically relevant information in both 
conditions was identical. Half the argu- 
ments were valid, and the other half 
were invalid. 

If mental model theory is correct, all rea- 
soning trials should activate a visuospatial 
system (perhaps parietal cortex). If men- 
tal logic theory is correct, we would ex- 
pect activation of the language system (left 
frontal and temporal lobe regions). Dual 
mechanism theory predicts engagement of 
two distinct (but unspecified) neural sys- 
tems, as determined by whether subjects re- 
spond in a schooled, formal manner or an 
intuitive, implicit manner. What we actu- 
ally found was that the main effect of rea- 
soning implicated large areas of the brain 
(Figure 20.2), including regions predicted 
by these theories. 

However, closer examination revealed 
this to be a composite activation consist- 
ing of two dissociable neural systems. The 
content reasoning trials compared with no- 
content reasoning trials revealed activa- 
tion in left middle and superior temporal 
lobe (BA 21/22), left temporal pole (BA 
21/38), and left inferior frontal lobe (BA 47) 
(Figure 20.3a). This is essentially a language 
and memory system. A similar network was 
activated in previous PET studies of de- 
ductive reasoning using contentful sentences 
(Goel et al., 1997, 1998). 

The reverse comparison of no-content 
reasoning trials versus content reasoning tri- 
als resulted in activation of bilateral occipital 
(BA 19), bilateral superior and inferior pari- 
etal lobes (BA 7), and bilateral dorsal (BA 6) 
and inferior (BA 44) frontal lobes (Figure 
20.3b). This pattern of activation is known 
to be involved in the internal representa- 
tion and manipulation of spatial information 
(Jonides et al., 1993; Kosslyn et al., 1989) 
and is very similar to that reported for tran- 
sitive inference involving geometrical shapes 
(Acuna et al., 2002) and certain types of 
mathematical reasoning involving approx- 
imation of numerical quantities (Dehaene 
et al., 1999). 

It is possible to argue that the patterns 
of activation revealed by the direct compar- 
ison of content and no-content conditions 
are just a function of the presence or ab- 
sence of content words, rather than being in- 
dicative of different reasoning mechanisms. 
To exclude this possibility, we examined the 
Content (content, no content) by Task (rea- 
soning, baseline) interaction. The modula- 
tion of reasoning, by the addition of content 
([content reasoning — content baseline] — 
[no-content reasoning — no-content base- 
line]) revealed activation in Wernicke’s area. 
The reverse interaction, which examined the 
effect of the absence of semantic content 
([no-content reasoning — no-content base- 
line] — [content reasoning — content base- 
line]), activated left parietal cortex. This 
interaction analysis eliminates the aforemen- 
tioned possibility and confirms the involve- 
ment of both systems in the reasoning 
process. 
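
The Content by Task interaction described above is a difference of differences, which is why it is insensitive to a region that responds merely to the presence of content words. The sketch below illustrates this with made-up signal values for two hypothetical voxels; the numbers and names are assumptions for illustration and do not come from the study.

```python
CONDITIONS = (
    "content_reasoning",
    "content_baseline",
    "nocontent_reasoning",
    "nocontent_baseline",
)

# Weights for (content reasoning - content baseline)
#           - (no-content reasoning - no-content baseline).
CONTENT_BY_TASK = {
    "content_reasoning": 1,
    "content_baseline": -1,
    "nocontent_reasoning": -1,
    "nocontent_baseline": 1,
}


def contrast(means, weights):
    """Weighted sum of condition means."""
    return sum(weights[c] * means[c] for c in CONDITIONS)


def reverse(weights):
    """Weights for the reverse interaction."""
    return {c: -w for c, w in weights.items()}


# Hypothetical voxel whose reasoning-related response is larger when content is present.
voxel = {
    "content_reasoning": 3.0,
    "content_baseline": 1.0,
    "nocontent_reasoning": 1.5,
    "nocontent_baseline": 1.0,
}

# Hypothetical voxel that responds to content words alike in reasoning and baseline trials.
words_only = {
    "content_reasoning": 2.0,
    "content_baseline": 2.0,
    "nocontent_reasoning": 1.0,
    "nocontent_baseline": 1.0,
}

print(contrast(voxel, CONTENT_BY_TASK))
print(contrast(voxel, reverse(CONTENT_BY_TASK)))
print(contrast(words_only, CONTENT_BY_TASK))
```

The second voxel would show up in a direct content versus no-content comparison, but its interaction value is zero because the effect of content words cancels between the reasoning and baseline terms.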

Contrary to mental logic theories that 
predict the language (syntactic) system is 
necessary and sufficient for deductive rea- 
soning and mental model theories that pre- 
dict the visuospatial system is necessary and 
sufficient for logical reasoning, Goel et al. 
(2000) found evidence for the engagement 
of both systems. The presence of semantic 
content engages the language and long-term 
memory systems in the reasoning process. 
The absence of semantic content engages the 
visuospatial system in the logically identical 
reasoning task. Before discussing the impli- 
cations of these results for cognitive theo- 
ries, let us consider some additional issues 
and data. 

The Goel et al. (2000) study raises several 
interesting questions, one of which has to 
do with the involvement of a parietal visual- 
spatial system in the no-content or abstract 
syllogism condition. A second question has 
to do with the exact property of the stimuli 
that leads to the modulation of neural ac- 
tivity between frontal-temporal and parietal 
systems. Pursuing the first question led to a 
clarification of the second question. 

The first question is whether argument 
forms involving three-term spatial relations 
such as: 


The apples are in the barrel 
The barrel is in the barn 
The apples are in the barn 


and 


A are in B 
B is in C 
A are in C 


are sufficient to engage the parietal sys- 
tem irrespective of the presence of content? 
One rationale for thinking this might be the 
case is subjects’ reported phenomenological 
experience of using a visuospatial strategy 
during these tasks. Secondly, neuroimaging 
studies have also shown the involvement of 
the parietal system in the encoding of re- 
lational spatial information (Laeng, 1994; 
Mellet et al., 1996). To address this ques- 
tion, we carried out another fMRI study, this 
time using three-term relational arguments 
like those mentioned previously (Goel & 
Dolan, 2001). 


Figure 20.3. (a) The content reasoning minus no-content reasoning comparison 
revealed activation of the left middle / superior temporal lobe (BA 21/22), the 
left inferior frontal lobe (BA 47), and bilateral (BA 17) and lingual gyri (BA 18). 
(b) The no-content reasoning minus content reasoning comparison revealed 
activation of (a) bilateral occipital (BA 18, 19) and (c) bilateral superior and 
inferior parietal lobes (BA 7, 40), bilateral precentral gyrus (BA 6), and 
bilateral middle frontal gyrus (BA 6). Reprinted from Goel et al. (2000) with 
permission from Elsevier. 

Goel and Dolan (2001) found that reason- 
ing about abstract and concrete three-term 
relations, as in the aforementioned exam- 
ples, recruited a bilateral parietal-occipital 
system with greater involvement of pari- 
etal and occipital lobes in the abstract con- 
dition compared with the concrete con- 
dition. There was an absence of the two 
dissociable networks for concrete and ab- 
stract reasoning reported in the first study. 
In particular, the temporal lobe (BA 21/22) 
activation evident in concrete syllogistic rea- 
soning in the first study was conspicuously 
absent in this study. One explanation for the 


lack of temporal lobe (BA 21/22) activation 
in Goel and Dolan (2001) might be the na- 
ture of the content used in the two stud- 
ies. The concrete sentences in Goel et al. 
(2000) were of the form “All apples are poi- 
sonous” whereas the concrete sentences in 
Goel & Dolan (2001) were of the form “John 
is to the right of Mary.” The former sentence 
types predicate known properties to known 
objects. We have beliefs about whether “all 
apples are poisonous.” By contrast, the latter 
sentence types do not allow for such beliefs.° 
This leaves open the interesting possibility 
that involvement of BA 21/22 in reasoning 
may be specific to content processing in- 
volving belief networks rather than just con- 
crete contents. 

This possibility was examined in a study by Goel and 
Dolan (2003) in which subjects were pre- 
sented with arguments, such as 


No reptiles are hairy 
Some elephants are hairy 
No elephants are reptiles 


containing sentences that subjects could be 
expected to have beliefs about, and belief- 
neutral arguments, such as 


No codes are highly complex 
Some quipu are highly complex 
No quipu are codes 


containing sentences that subjects may not 
have beliefs about (because they may not 
know the meaning of one or more key 
terms). The referential terms in the two con- 
ditions were counterbalanced for abstract 
and concrete categories. 

The results of this study replicated and 
clarified the results of Goel et al. (2000). 
Modulation of the reasoning task by absence 
of belief [(belief-neutral reasoning — belief- 
neutral baseline) — (belief-laden reasoning — 
belief-laden baseline) ] revealed activation in 
the left superior parietal lobe (BA 7) unique 
to the belief-neutral condition. The re- 
verse modulation [(belief-laden reasoning — 
belief-laden baseline) — (belief-neutral rea- 
soning — belief-neutral baseline)] revealed 
activation of anterior left middle temporal 
gyrus (BA 21) unique to the belief-bias con- 
dition. These results confirm that a critical 
(sufficient) factor in the modulation of ac- 
tivity between these two neural systems is 
the presence of familiar or belief-laden con- 
tent in the reasoning processes. 


GENERALIZATION OF DISSOCIATION TO 
TRANSITIVE REASONING 

We have demonstrated dual pathways for 
reasoning about categorical syllogisms. The 
question arises whether the results general- 
ize to other forms of logical reasoning, par- 
ticularly three-term spatial relations, where 
one might think the visuospatial system to 
be sufficient. To answer this question, Goel, 
Makale, and Grafman (2004) studied 14 vol- 
unteers using event-related fMRI, as they 
reasoned about landmarks in familiar and 
unfamiliar environments. 
Half the arguments contained sentences, 
such as 


Paris is south of London 
London is south of Edinburgh 
Paris is south of Edinburgh 


describing environments with which sub- 
jects would be familiar (as confirmed by a 
post-scan questionnaire), whereas the other 
half contained sentences, such as 


The AI lab is south of the Roth Centre 
Roth Centre is south of Cedar Hall 
AI lab is south of Cedar Hall 


that subjects could not be familiar with be- 
cause they describe a fictional, unknown 
environment. 

Our main finding was an interaction be- 
tween Task (reasoning and baseline) and 
Spatial Content (familiar and unfamiliar). 
Modulation of reasoning regarding unfamil- 
iar landmarks resulted in bilateral activation 
of superior and inferior parietal lobule (BA 
7, 40), dorsal superior frontal cortex (BA 6), 
and right superior and middle frontal gyri 
(BA 8), regions frequently implicated in 
visuospatial processing. By contrast, mod- 
ulation of the reasoning task involving fa- 
miliar landmarks engaged right inferior and 
orbital frontal gyrus (BA 11/47), bilateral 
occipital (BA 18, 19), and temporal lobes. 
The temporal lobe activation included right 
inferior temporal gyrus (BA 37), posterior 
hippocampus, and parahippocampal gyrus, 
regions implicated in spatial memory and 
navigation tasks. These results provide sup- 
port for the generalization of our dual mech- 
anism account of transitive reasoning and 
highlight the importance of the hippocam- 
pal system in reasoning about landmarks in 
familiar spatial environments. 


EVIDENCE FOR DISSOCIATION FROM PATIENT DATA 


If we are correct that reasoning involving fa- 
miliar situations engages a frontal-temporal 
lobe system and formally identical reasoning 
tasks involving unfamiliar situations recruit a 
frontal-parietal visuospatial network, with 
greater frontal lobe involvement in the for- 
mer than the latter, then frontal lobe le- 
sion patients should be more impaired on 
reasoning about familiar situations than on 
unfamiliar situations. To test this hypoth- 
esis, Goel et al. (2004) administered the 
Wason 4-Card Selection Task (Wason, 1966) 
to 19 frontal lobe patients and 19 age- and 
education-matched normal controls. 

Wason 4-Card Selection Task (WST) 
(Wason, 1966) is the most widely used task 
to explore the role of content in reasoning. 
Subjects are shown four cards. They can see 
what is printed on one side of each card, but 
not the other side. They are given a rule of 
the form: if p then q (e.g., “If a card has 
a vowel on one side, it must have an even 
number on the other side.”) and asked which 
cards they must turn over in order to verify 
the rule. The visible values on the cards cor- 
respond to the p, not-p, q, and not-q cases 
of the rule. According to standard proposi- 
tional logic, the correct choices are p (to ver- 
ify q is on the other side) and not-q (to verify 
p is not on the other side). Given an arbitrary 
rule like the above, typically fewer than 25% 
of normal subjects will turn over both the p 
and the not-q cards. However, the introduc- 
tion of familiar, meaningful content in a rule 
(e.g., “If anyone is drinking beer, then that 
person must be over 21 years old.”) greatly 
facilitates performance (Cheng & Holyoak, 
1985; Cosmides, 1989; Cox & Griggs, 1982; 
Gigerenzer & Hug, 1992; Griggs & Cox, 
1982; Wason & Shapiro, 1971). 
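
The normative answer to the selection task can be derived mechanically: a card needs to be turned over only if some possible hidden face would violate the rule. The sketch below works through the vowel and even-number rule quoted above; the particular card faces (E, K, 4, 7) are hypothetical, chosen to stand for the p, not-p, q, and not-q cases.

```python
VOWELS = set("AEIOU")


def violates(letter, number):
    """'If a vowel on one side, then an even number on the other' is broken
    only by a vowel paired with an odd number."""
    return letter in VOWELS and number % 2 != 0


def must_turn(card, letters=("E", "K"), numbers=(4, 7)):
    """A card must be turned if some possible hidden face would violate the rule."""
    if isinstance(card, str):
        return any(violates(card, n) for n in numbers)
    return any(violates(letter, card) for letter in letters)


# Hypothetical visible faces: p (E), not-p (K), q (4), not-q (7).
cards = ["E", "K", 4, 7]
selected = [card for card in cards if must_turn(card)]
print(selected)
```

The q card (4) is never selected because nothing on its hidden side could falsify the rule, which is the step most subjects miss with arbitrary content.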

Specifically, we manipulated the social 
knowledge involved in the task in the form 
of “permission schemas” (Cheng & Holyoak, 
1985). Subjects performed the task with an 
arbitrary rule condition (“If a card has an A 
on one side, then it must have a 4 on the 
other side.”), an abstract permission condi- 
tion (“If one is to take action A, then one 
must first satisfy precondition P.”), and a 
concrete permission condition (“If a person 
is to drink alcohol, he or she must be at 
least 21.”). 

The principal findings were that, in the 
purely logical (arbitrary rule) condition, 
frontal lobe patients performed just as well 
(or just as poorly) as normal controls. How- 
ever, patient performance did not improve 
with the introduction of social knowledge in 
the form of abstract or concrete permission 
schemas as did normal control performance. 
Furthermore, there was no significant corre- 
lation between volume loss, IQ scores, mem- 
ory scores, or years of education and perfor- 
mance in the abstract or concrete permission 
schema conditions. The failure of patients to 
benefit from social knowledge therefore can- 
not be explained in terms of volume loss, IQ 
scores, memory scores, or years of education. 

Consistent with the neuroimaging data, 
our interpretation is that the arbitrary rule 
condition of the WST involves greater acti- 
vation of the parietal lobe system, whereas 
the permission schema trials result in greater 
engagement of a frontal-temporal lobe 
system. The normal controls have both 
mechanisms intact and can take advantage 
of social knowledge cues to facilitate the rea- 
soning process. The patients’ parietal system 
is intact, and so their performance on the 
arbitrary rule trial is the same as that of nor- 
mal controls. Their frontal lobe system is 
disrupted, preventing them from taking ad- 
vantage of social knowledge cues in the per- 
mission schema trials.” 


HEMISPHERIC ASYMMETRY 


Our imaging studies have also revealed an 
asymmetry in frontal lobe involvement in 
logical reasoning. Reasoning about belief- 
laden material (e.g., All dogs are pets; All 
poodles are dogs; All poodles are pets) ac- 
tivates left prefrontal cortex (Figure 20.4a), 
whereas reasoning about belief-neutral ma- 
terial (e.g., All A are B; All C are A; All 
C are B) activates bilateral prefrontal cor- 
tex (Figure 20.4b) (Goel et al., 2000; Goel 
& Dolan, 2003). This asymmetry shows up 
consistently in patient data. 

Caramazza et al. (1976) administered 
two-term problems such as the following: 
“Mike is taller than George. Who is taller?” 
to brain-damaged patients. They reported 
that left hemisphere lesion patients were im- 
paired in all forms of the problem, but right 
hemisphere lesion patients were impaired 
only when the form of the question 
was incongruent with the premise 
(e.g., Who is shorter?). Read (1981) tested 
temporal lobectomy patients on three-term 
relational problems with semantic content 
(e.g., George is taller than Mary; Mary is 
taller than Carol; Who is tallest?). Subjects 
were told that using a mental imagery 
strategy would help them to solve these 
problems. He reported that left temporal 
lobectomy patient performance was more 
impaired than right temporal lobectomy 
patient performance. In a more recent 
study using matched verbal and spatial 
reasoning tasks, Langdon and Warrington 
(2000) found that only left hemisphere 
lesion patients failed the verbal section, 
and both left and right hemisphere lesion 
patients failed the spatial sections. They 
concluded by emphasizing the critical role 
of the left hemisphere in both verbal and 
spatial logical reasoning. 


Figure 20.4. (a) Reasoning involving familiar conceptual content activates left 
inferior prefrontal cortex. (b) Reasoning involving unfamiliar content activates 
bilateral prefrontal cortex. (c) Right prefrontal cortex mediates belief-logic 
conflict detection and/or resolution. Reprinted from Goel et al. (2000) with 
permission from Elsevier. 

In the WST patient study discussed previ- 
ously (Goel et al., 2004), not only was it the 
case that frontal lobe patients failed to bene- 
fit from the introduction of familiar content 
into the task, but the result was driven by the 
poor performance of left hemisphere lesion 
patients. There was no difference in perfor- 
mance between right hemisphere lesion pa- 
tients and normal controls but only between 
left hemisphere lesion patients and controls. 
These data show that the left hemisphere 

Figure 20.5. (a) Correct inhibitory trials activate right prefrontal cortex. (b) 
Incorrect inhibitory trials activate VMPFC. Reprinted from Goel & Dolan 
(2003) with permission from Elsevier. 


is necessary and often sufficient for reason- 
ing whereas the right hemisphere is some- 
times necessary but not sufficient. [This is of 
course contrary to the Johnson-Laird (1994) 
prediction for mental model theory, but, as 
noted previously, we chose to modify this 
prediction to make it consistent with basic 
neuropsychological knowledge.] 


DEALING WITH BELIEF-LOGIC CONFLICTS 


Although from a strictly logical point of 
view, deduction is a closed system, we have 
already mentioned that beliefs about the 
conclusion of an argument influence peo- 
ple’s validity judgments (Wilkins, 1928). 
When arguments have familiar content, the 
truth value (or believability) of a given con- 
clusion will be consistent or inconsistent 
with the logical judgment. Subjects per- 
form better on syllogistic reasoning tasks 
when the truth value of a conclusion (true 
or false) coincides with the logical rela- 
tionship between premises and conclusion 
(valid or invalid) (Evans, Barston, & Pol- 
lard, 1983). Such trials are facilitatory to 
the logical task and consist of valid ar- 
guments with believable conclusions (e.g., 
Some children are not Canadians; All chil- 
dren are people; Some people are not Cana- 
dians) and invalid arguments with unbeliev- 
able conclusions (e.g., Some violinists are 
not mutes; No opera singers are violinists; 
Some opera singers are mutes). When the 
logical conclusion is inconsistent with sub- 
jects’ beliefs about the world, the beliefs 
are inhibitory to the logical task and de- 
crease accuracy (Evans, Barston, & Pollard, 
1983). Inhibitory belief trials consist of valid 
arguments with unbelievable conclusions 
(e.g., No harmful substances are natural; All 
poisons are natural; No poisons are harmful) 
and invalid arguments with believable con- 
clusions (e.g., All calculators are machines; 
All computers are calculators; Some ma- 
chines are not computers). Performance on 
arguments that are belief-neutral usually 
falls between these two extremes (Evans, 
Handley, & Harper, 2001). 
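The trial taxonomy above can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the studies cited. It checks validity by brute force over every way of populating the three terms, assuming the modern reading of categorical statements (no existential import for "All" and "No"), and it takes believability as a hand-supplied label. The names `holds`, `is_valid`, and `trial_type` are hypothetical.

```python
from itertools import product

# An "individual type" records membership in the three terms (index 0, 1, 2).
TYPES = list(product([False, True], repeat=3))

def holds(stmt, world):
    """Evaluate a categorical statement (form, x, y) in a world.

    A world is the list of individual types that are inhabited; x and y are
    term indices. Assumption: no existential import for 'all' and 'no'.
    """
    form, x, y = stmt
    if form == "all":        # All X are Y
        return all(t[y] for t in world if t[x])
    if form == "no":         # No X are Y
        return not any(t[y] for t in world if t[x])
    if form == "some":       # Some X are Y
        return any(t[y] for t in world if t[x])
    if form == "some_not":   # Some X are not Y
        return any(not t[y] for t in world if t[x])
    raise ValueError(form)

def is_valid(premises, conclusion):
    """Valid iff the conclusion holds in every world satisfying the premises."""
    for mask in range(2 ** len(TYPES)):
        world = [t for i, t in enumerate(TYPES) if (mask >> i) & 1]
        if all(holds(p, world) for p in premises) and not holds(conclusion, world):
            return False     # found a counterexample world
    return True

def trial_type(valid, believable):
    """Facilitatory when belief agrees with logic, inhibitory when it conflicts."""
    return "facilitatory" if valid == believable else "inhibitory"

# Terms: 0 = children, 1 = Canadians, 2 = people
ex1 = is_valid([("some_not", 0, 1), ("all", 0, 2)], ("some_not", 2, 1))
# Terms: 0 = harmful substances, 1 = natural things, 2 = poisons
ex2 = is_valid([("no", 0, 1), ("all", 2, 1)], ("no", 2, 0))

print(trial_type(ex1, believable=True))    # facilitatory
print(trial_type(ex2, believable=False))   # inhibitory
```

Because the four statement forms depend only on which combinations of term membership are inhabited, 256 candidate worlds suffice for three terms. Under a reading with existential import, some verdicts for other syllogisms would differ.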

Goel et al. (2000) noted that when log- 
ical arguments result in a belief-logic con- 
flict, the nature of the reasoning process is 
changed by the recruitment of the right lat- 
eral prefrontal cortex (Figure 20.4c). Goel 
and Dolan (2003) further noted that, within 
the inhibitory belief trials, a comparison of 
correct items with incorrect items (correct 
inhibitory belief trials — incorrect inhibitory 
belief trials) revealed activation of right in- 
ferior prefrontal cortex (Figure 20.5a). The 
reverse comparison of incorrect response tri- 
als with correct response trials (incorrect 
inhibitory belief trials — correct inhibitory 
belief trials) revealed activation of VMPFC 
(Figure 20.5b). 

Within the inhibitory belief trials, the pre- 
potent response is associated with belief- 
bias. Correct responses (in inhibitory trials) 
indicate that subjects detected the conflict 
between their beliefs and the logical infer- 
ence, inhibited the prepotent response asso- 
ciated with the belief bias, and engaged the 
reasoning mechanism. Incorrect responses in 
such trials indicate that subjects failed to de- 
tect the conflict between their beliefs and 
the logical inference and/or inhibit the pre- 
potent response associated with the belief 
bias. Their response is biased by their beliefs. 
The involvement of right prefrontal cortex 
in correct response trials is critical in detect- 
ing and/or resolving the conflict between be- 
lief and logic. Such a role of the right lat- 
eral prefrontal cortex was also noted in Goel 
et al. (2000) and in a study of maintenance 
of an intention in the face of conflict be- 
tween action and sensory feedback (Fink 
et al., 1999). A similar phenomenon has 
been noted in the Caramazza et al. (1976) 
study mentioned previously in which right 
hemisphere lesion patients were impaired 
only when there was an incongruency in the 
form of the question and the premises. By 
contrast, the activation of VMPFC in incor- 
rect trials highlights its role in nonlogical, 
belief-based responses. 


Conclusions and Future Directions 


Consequences for Cognitive Theories 
of Reasoning 


We now briefly address the question of how 
these data map onto the cognitive theories of 
reasoning with which we began our discus- 
sion. This is a complex question because the 
data do not fit neatly with any of the three 
theories. First and foremost, we show a dis- 
sociation in mechanisms involved in belief- 
neutral and belief-laden reasoning. The two 
systems we have identified are roughly the 
language system and the visuospatial system, 
which is what mental logic theory and men- 
tal model theory respectively predict. How- 
ever, neither theory anticipates this dissocia- 
tion. Each theory predicts that the system it 
postulates is necessary and sufficient for rea- 
soning. This implies that the neuroanatom- 
ical data cross-classify these cognitive theo- 
ries. A further complication is that mental 
logic theory implicates the syntactic com- 
ponent of language in logical reasoning. Our 
studies activate both the syntactic and se- 
mantic systems and components of long- 
term memory. 

Our results do seem compatible with 
some form of dual mechanism theory, which 
explicitly predicts a dissociation. However, 
as noted, this theory comes in various fla- 
vors and some advocates may not be keen 
to accept our conclusions. The distinction 
that our results point to is between rea- 
soning with familiar, conceptually coher- 
ent material versus unfamiliar, nonconcep- 
tual, or incoherent material. The former 
engages a left frontal-temporal system (lan- 
guage and long-term memory) whereas the 
latter engages a bilateral parietal (visuospa- 
tial) system. Given the primacy of belief 
bias over effortful thinking (Sloman, 1996), 
we believe that the frontal-temporal system 
is more “basic” and effortlessly engaged. It 
has temporal priority. By contrast, the pari- 
etal system is effortfully engaged when the 
frontal-temporal route is blocked because of 
a lack of familiar content, or when a conflict 
is detected between the logical response and 
belief bias. 

This is very consistent with the dual 
mechanism account developed by Newell & 
Simon (1972) for the domain of problem 
solving. On this formulation, our frontal- 
temporal system corresponds to the “heuris- 
tic” system whereas the parietal system 
corresponds to the “universal” system. Rea- 
soning about familiar situations automati- 
cally utilizes situation-specific heuristics that 
are based on background knowledge and ex- 
perience. When no such heuristics are avail- 
able (as in reasoning about unfamiliar sit- 
uations), universal (formal) methods must 
be used to solve the problem. In the case of 
syllogistic reasoning, this may well involve a 
visuospatial system. 

Our results go beyond addressing cogni- 
tive theories of reasoning and provide new 
insight into the role of the prefrontal cor- 
tex in human reasoning. In particular, the 
involvement of the prefrontal cortex in log- 
ical reasoning is selective and asymmetric. 
Its engagement is greater in reasoning about 
miliar, content-sparse situations. The left 
prefrontal cortex is necessary and often suffi- 
cient for reasoning. The right prefrontal cor- 
tex is sometimes necessary but not sufficient 
for reasoning. It is engaged in the absence of 
conceptual content and in the face of con- 
flicting or conceptually incoherent content 
(as in the belief-logic conflicts discussed pre- 
viously). Finally, the VMPFC is engaged by 
nonlogical, belief-biased responses. 


Future Directions 


Although some progress has been made over 
the past eight years, the cognitive neuro- 
science of reasoning is in its infancy. The next 
decade should be an exciting time of rapid 
development. There are a number of issues 
that we see as particularly compelling for 
further investigation. The first is how well 
the results can be generalized. Will the re- 
sults regarding syllogisms, which are quite 
difficult, generalize to basic low-level infer- 
ences such as modus ponens and modus 
tollens? Second, all the imaging studies to 
date have utilized a paradigm involving the 
recognition of a given conclusion as valid or 
invalid. It remains to be seen whether the 
generation of a conclusion would involve the 
same mechanisms. Third, given the involve- 
ment of visuospatial processing systems in 
much of reasoning and the postulated differ- 
ences between males and females in process- 
ing spatial information (Jones, Braithwaite, 
& Healy, 2003), one might expect neural- 
level differences in reasoning between the 
genders. Fourth, the issue of task difficulty 
has not been explored. As reasoning trials 
become more difficult, are additional neural 
resources recruited, or are the same struc- 
tures activated more intensely? Fifth, what 
is the effect of learning on the neural mech- 
anisms underlying reasoning? Sixth, most 
imaging studies to date have focused on de- 
duction. Although deduction is interesting, 
much of human reasoning actually involves 
induction. The relationship between the two 
at the neural level is still an open question. 
Finally, reasoning does not occur in a vac- 
uum. Returning to the example of my chil- 


son, “If you want dinner tonight, you bet- 
ter stop tormenting your sister” in a calm, 
unconcerned voice, it usually has an effect. 
However, if I state the same proposition in 
an angry, threatening voice, the impact is 
much more complete and immediate. Given 
that the logic of the inference is identical in 
the two cases, the emotions introduced into 
the situation through the modulation of my 
voice are clearly contributing to the result- 
ing behavior. In fact, emotions can be intro- 
duced into the reasoning process in at least 
three ways: (i) in the content or substance of 
the reasoning material; (ii) in the presenta- 
tion of the content of the reasoning material 
(as in voice intonation); and (iii) in the pre- 
existing mood of the reasoning agent. We are 
currently channeling much of our research 
efforts to understanding the neural basis of 
the interaction between emotions and ratio- 
nal thought. 


Notes 


1. Whether there is any substantive difference 
between “knowing the inferential role” and 
“knowing the meaning” of the closed-form 
terms is a moot point, debated in the litera- 
ture. 


2. See Newell (1980b) for a discussion of the re- 
lationship between search and inference. 

3. The Church-Turing hypothesis makes the con- 
jecture that all computable functions belong to 
the class of functions computable by a Turing 
Machine. So if we constrain the class of func- 
tions called for by our psychological theories to 
the class of computable functions, then there 
the function. 


4. These are, of course, logical claims about neu- 
roimaging and lesion studies. As in all empirical 
work, there are a number of complicating fac- 
tors, including the relationship between statis- 
tical significance (or insignificance) and reality 
of an observed effect. 

5. Again, I am making a logical point, indepen- 
dent of the usual complexities of mapping be- 
havior onto causal mechanisms. 

6. It is possible to generate relational sentences 
one can have beliefs about; for example, 
“London is north of Rome” or “Granite is 
harder than diamonds.” 

7. See also Bachman and Cannon, Chap. 21, for 
further discussion of disrupted thinking in pa- 
tient populations. 


CHAPTER 21 


Cognitive and Neuroscience 
Aspects of Thought Disorder 


Peter Bachman 
Tyrone D. Cannon 


Introduction 


During the course of assessments carried out 
at two clinical research centers, the follow- 
ing responses were provided to standardized 
questions intended to be open-ended and to 
elicit relatively abstract responses: 


[Examiner A] “Can you explain the 
proverb, ‘Speech is the picture of the 
mind’?” 


[Subject A] “You see the world through 
speech. Like my grandfather used to speak 
to me of Alaskans and Alsatians and blood 
getting thicker and thinner in the Eskimo. 
He was against the Kents in England. I 
can’t smoke a Kent cigarette to this day” 
(Harrow & Quinlan, 1985, p. 44). 


[Examiner B] “Why should people pay 
taxes?” 


[Subject B] “Taxes is an obligation as 
citizens of our country. To our nation, to 
this country, the United States. As a citi- 
zen, I think we have an obligation. I think 
that’s carried to an extreme. Within reason, 
taxes within reason. Taxation, we have 
representation, so therefore we have taxa- 


tion. For we formed our constitution, it was 
taxation without representation is treason” 
(Johnston & Holzman, 1979, p. 263). 


Reading these two responses, and imag- 
ining hearing them aloud in conversation, 
almost surely evokes the feeling that some- 
thing is not quite right — the statements are 
somehow disordered. For instance, the ob- 
jects referred to in the first response seem to 
be related to each other only indirectly and 
along varying linguistic dimensions. Conse- 
quently, by the end of the statement, the re- 
sponse deviates dramatically from the con- 
tent requested by the examiner. The reply 
to the second examiner’s question does not 
follow such a rapidly digressing course but 
instead seems to fixate on an idea, or perhaps 
a phrase (“taxation without representation”) 
indirectly related to the content of the ques- 
tion, seeming to repeat and elaborate on that 
phrase without offering additional ideas. 

Apart from describing how these state- 
ments are disordered, understanding why 
the speakers produced them in such a 
manner is a daunting task. The process 
of comprehending and generating speech 
involves numerous interrelated cognitive 
mechanisms (Levelt, 1989), any or all of 
which could contribute to abnormal speech 
comprehension or production. Moreover, 
thought disorder tends to occur within the 
context of a more extensive psychopathol- 
ogy (Andreasen & Grove, 1986), including 
diagnoses as diverse as schizophrenia, mood 
disorders, certain personality disorders, and 
autism (American Psychiatric Association, 
2000; Andreasen & Grove, 1986). In fact, 
the patient quoted in the first reply was di- 
agnosed with schizophrenia, the condition 
perhaps most closely associated with the 
presence of thought disorder (e.g., Bleuler, 
1911/1950). The second quote, however, was 
provided by an individual who was not her- 
self diagnosed with a psychiatric disorder but 
who has a daughter diagnosed with schizoaf- 
fective disorder (a condition thought to be 
closely related to schizophrenia; American 
Psychiatric Association, 2000), highlighting 
the role of heritable factors contributing to 
significant symptom expression even in the 
absence of a clear diagnostic label. 

Despite the prevalence of thought dis- 
order across diagnostic populations, sys- 
tematic efforts to study the pathology of 
thought disorder across diverse conditions, 
looking for common disease mechanisms, 
are rare. Rather, most investigators have 
chosen to study thought disorder strictly 
within the context of a particular dis- 
ease entity such as schizophrenia. Unfortu- 
nately, the multitude of difficulties many 
schizophrenia patients face — including de- 
graded information processing capabilities; 
presence of debilitating symptoms (includ- 
ing hallucinations and delusions); medica- 
tion side-effects; social and occupational 
morbidity; stressful relationships with rel- 
atives, who may themselves be burdened 
by psychiatric disorders — defies models 
of etiology based on a single underlying 
deficit. In recognition of this complexity, 
psychopathology researchers have begun to 
dissect disorders whose manifestation coin- 
cides with the presentation of thought dis- 
order into more fundamental neurocognitive 
traits that participate in symptom formation. 


demonstrated, for instance, that certain neu- 
rocognitive disruptions in schizophrenia are 
associated with genetic vulnerability to the 
illness, whereas other traits are associated 
with disease expression (e.g., Cannon et al., 
2000; Cannon et al., 2002). 

This more complex, integrative view 
of the pathology of thought disorder en- 
dows the investigator with a more powerful 
heuristic for grappling with the multitude 
of intertwined cognitive domains and lev- 
els of analysis active in the study of thought 
and how thought might come to be disor- 
dered in particular disease states (Cannon & 
Rosso, 2002). Perhaps, for instance, the in- 
tegration of behavioral genetic and experi- 
mental psychopathology approaches might 
yield the finding that deficits in two infor- 
mation processing systems may, on their 
own, be necessary but not sufficient for the 
phenotypic manifestation of thought disor- 
der, whereas the coincidence of these deficits 
may result in its overt expression. 
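The hypothetical above has a simple logical form, which a short sketch may make explicit. Assuming, purely for illustration, two binary deficits and a conjunctive rule (the names `deficit_a`, `deficit_b`, and `expresses_thought_disorder` are placeholders, not constructs from the literature), each deficit is necessary but not sufficient exactly when overt expression requires both.

```python
from itertools import product

def expresses_thought_disorder(deficit_a: bool, deficit_b: bool) -> bool:
    """Illustrative conjunctive rule: overt expression requires both deficits."""
    return deficit_a and deficit_b

# All four combinations of the two deficits being absent or present.
cases = list(product([False, True], repeat=2))

# Necessary: expression never occurs without deficit A.
a_necessary = all(a for a, b in cases if expresses_thought_disorder(a, b))
# Sufficient would mean: whenever deficit A is present, expression follows.
a_sufficient = all(expresses_thought_disorder(a, b) for a, b in cases if a)

print(a_necessary, a_sufficient)  # True False
```

By symmetry the same holds for the second deficit. Real traits are graded rather than binary, so this is only the limiting case of the argument.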

In applying this framework to a larger dis- 
cussion of thought disorder, we attempt to 
elucidate a set of neurobiological and cog- 
nitive conditions that may participate in the 
generation of thought disorder through their 
collective action. More specifically, we focus 
first on the expression of thought disorder 
in psychopathology, highlighting descriptive 
approaches. Subsequently, we shift to dis- 
cussion of a prominent model of speech 
production, as well as two models of dis- 
ordered thinking in schizophrenia, to help 
us identify cognitive mechanisms likely dis- 
rupted in individuals displaying thought dis- 
order. Finally, we attempt to integrate find- 
ings from distinct levels of analysis (e.g., 
behavioral and molecular genetics, structural 
and functional neuroanatomy, behavioral 
performance) to characterize diverse aspects 
of psychiatric disorder as traits specific to 
disease expression, which we characterize as 
involving abnormal activation of the brain’s 
temporal lobe structures critical to the for- 
mation and retrieval of long-term memo- 
ries and other types of concrete information 
and also involving traits specific to genetic 
vulnerability, which tend to involve more 
online maintenance and manipulation of in- 
formation (also see Morrison, Chap. 19). 


Defining Thought Disorder 


Perhaps the most common usage of the term 
“thought disorder,” at least within clinical 
settings, is as shorthand for “formal thought 
disorder,” which refers to a taxonomy of 
symptoms involving abnormal speech (An- 
dreasen, 1979, 1982). In this usage, thought 
disorder is typically conceptualized as the 
product of a loosening of associations lead- 
ing to a loss of continuity between or- 
dered elements inferred to underlie a spoken 
utterance (Maher, 1991). The “formal” 
distinction, specifically with respect to 
schizophrenia, harkens back to the notion 
that pathologies of thought can be charac- 
terized as disorders of thought content or 
as disorders of thought form. The former of 
the two categories refers primarily to hal- 
lucinations, or well-defined percepts gener- 
ated endogenously but experienced as, and 
attributed to, exogenous events and to delu- 
sions — objectively false and often bizarre 
beliefs held with a high level of conviction 
(i.e., the patient maintains the belief in the 
face of counter-evidence; American Psychi- 
atric Association, 2000). A common exam- 
ple of a delusion is the belief that people in 
the patient’s environment have the intention 
of monitoring and even harming the patient 
(i.e., a paranoid delusion).¹ The latter cate- 
gory, disorders of thought form, involves a 
disorganization of underlying thought pro- 
cesses indicated by abnormal speech such 
as that quoted at the outset of this chapter. 
Factor analytic studies of symptom preva- 
lence in schizophrenia (e.g., Liddle, 1987) 
have generally supported this form-versus- 
content distinction, for ratings of formal 
thought disorder tend to covary with ratings 
of disorganized behavior on a factor includ- 
ing neither delusions nor hallucinations, sug- 
gesting that thought disorder symptoms in- 
deed reflect a disorganization of ideational 
elements not necessarily specific to articu- 
lated speech. 


to our description of formal thought disor- 
der as a result of ideational disorganization. 
A thought-disordered individual might pro- 
duce a neologism or a novel word formed 
by the unique integration of parts of other 
words. A neologism would therefore be 
conceptualized as the loosening of nor- 
mal associative relationships between in- 
dividual word parts (perhaps at the level 
of grammatical encoding, discussed sub- 
sequently). Johnston and Holzman (1979, 
p. 100) quoted one patient as responding to 
an examiner’s request to define the word 
“remorse” by replying, “Moisterous, being 
moistful,” combining legal word parts to form 
lexically invalid words. 

Similarly, an affected individual’s speech 
may be characterized by lexically valid but 
unrelated words strung together to make an 
unintelligible statement — a loosening of as- 
sociations between words. An example of 
this type of disordered comment is, “If things 
turn by rotation of agriculture or levels or 
timed in regard to everything...” (Maher, 
1966, p. 395). In its extreme form, clinicians 
sometimes refer to this type of disorgani- 
zation as “word salad,” indicating its highly 
jumbled presentation. 

Formal thought disorder may also man- 
ifest itself in an abrupt shift between in- 
directly related topics, representing a loos- 
ened association between ideas or clauses 
within or between sentences. For example, 
when one patient with formal thought dis- 
order was asked to explain why people who 
are born deaf are usually unable to talk, 
he replied, “When swallow in your throat 
like a key it comes out, but not as scissors. 
A robin, too, it means spring” (Harrow & 
Quinlan, 1985, p. 429). In this instance, the 
patient seems to have switched from em- 
ploying one meaning of the word swallow 
(i.e., the verb) to an alternate meaning (i.e., 
the type of bird) and then articulating a con- 
cept (i.e., “robin”) semantically related to the 
alternate meaning. 

Perhaps even more salient in this last ex- 
ample than the abrupt shift is that the pa- 
tient’s response seems only very tangentially 
consistent with the interviewer's question. 
ments that are overly vague or overly con- 
crete, or otherwise do not seem congruent 
with the semantic or interpersonal demands 
implied by the comment or question posed 
by the other participant in the conversation. 

In contrast to the taxonomy of speech 
abnormalities described previously, which 
resulted from application of the ideational 
confusion definition, Andreasen (1986) de- 
veloped a descriptive system of assessing 
thought disorder intended to eschew the- 
oretical assumptions regarding the pathol- 
ogy resulting in disordered speech, and to 
enhance clinical assessors’ statistical reliabil- 
ity. In all, Andreasen (1986) identified eight- 
een classes of speech abnormality, most of 
which mapped loosely to more traditional 
clinical conventions. For instance, the no- 
tion of loosened associations was replaced by 
a set of five somewhat more technical cat- 
egories, including “derailment,” which the 
authors characterized as a consistent flow 
of ideas only tangentially related to each 
other within the context of the spontaneous 
production of speech (sometimes referred 
to by clinicians as “flight of ideas”; And- 
reasen, 1986). 

For our current purposes, the specific dis- 
tinctions within Andreasen’s (1986) cata- 
log of categories are less important than 
some of the larger constructs that emerged 
from a factor analysis of observations of the 
prevalence of the abnormalities in a wide- 
ranging study of several psychiatric popula- 
tions (Andreasen & Grove, 1986). Indeed, 
as mentioned previously, five of the eight- 
een types of abnormality clustered together 
to form a “loose associations” dimension that 
seemed to indicate the overall level of behav- 
ioral disorganization shown by patients with 
psychotic disorders (Andreasen & Grove, 
1986). Another distinction appeared be- 
tween what the authors characterized as as- 
pects of “positive” and “negative” thought 
disorder (not to be confused with the analo- 
gous positive-negative schizophrenia symp- 
tom distinction) with the former involving 
aspects of loosened associations (e.g., derail- 
ment) in combination with a significant level 
of rapidity and volume of speech (sometimes 


latter involving speech that is impoverished 
in terms of average number and length of ut- 
terances or with respect to ideational content 
(as was the case in the second quote cited in 
the Introduction). Interestingly, this positive 
versus negative dimension seemed to effec- 
tively discriminate thought-disordered pa- 
tients with mania from thought-disordered 
patients with schizophrenia. 


Levelt’s Model of Normal 
Speech Production 


To provide an organizing framework for our 
consideration of models relevant to formal 
thought disorder, we turn first to a model of 
normal speech production. Levelt (Levelt, 
1989, 1999; Levelt, Roelofs, & Meyer, 1999) 
described such a model, which is particularly useful 
here because of its comprehensive incorpo- 
ration of diverse cognitive processes critical 
for effective interpersonal communication. 
As shown in Figure 21.1, Levelt’s model 
involves a serial process by which a message 
intended for communication moves through 
a succession of stages, each of which plays 
a unique role in transforming the message 
into an articulated sound wave. The first set 
of stages along this speech production se- 
quence constitutes what Levelt refers to as 
a “rhetorical/semantic/syntactic system” re- 
sponsible for filtering a given communica- 
tive intention through the speaker’s model 
of how the listener will perceive and un- 
derstand the message, which can be influ- 
enced by the speaker’s mental model of the 
listener. This system also sequences ideas in 
a logical order and places that sequence in 
a propositional format (specific to linguis- 
tic expression) that includes the selection 
of lexical concepts, in turn triggering the 
retrieval of appropriate lemmas from the 
mental lexicon (for a discussion of compu- 
tational evidence for the model's lexical se- 
lection mechanism, see Levelt, Roelofs, & 
Meyer, 1999; also see Medin & Rips, Chap. 
3). The retrieval of the appropriate lem- 
mas from the mental lexicon engages the 

Figure 21.1. Levelt’s model of normal speech production. Reprinted with permission from Levelt, 1999. 


syntactic construction of the message, for 
lemmas must agree syntactically with each 
other and with the overall communicative 
intent of the speaker. 

This retrieval of lemmas from the men- 
tal lexicon, which also entails retrieval 
of each lemma’s inherent morpho-phono- 
logical code, serves as a transition out of the 
“rhetorical/semantic/syntactic system” and 
into the “phonological/phonetic system.” In- 
deed, the lemmas’ score in the mental lex- 
icon represents the basic stage at which 
semantic and phonological information is 
bound together. 

Accordingly, the phonological codes asso- 
ciated with each lemma’s morphemes com- 
bine according to the predetermined se- 
quence to form the syllabic structure of 


the message, a relative process, the prod- 
uct of which does not necessarily respect 
the boundaries of the superordinate lem- 
mas. Next, during the process of phonetic 
encoding, the accumulation of the phono- 
logical syllables, or the phonological score, 
retrieves from a “mental syllabary” a gestu- 
ral, or articulatory score, completing the pro- 
cess by which a fully formed syntactic and 
phonological message retrieves an appropri- 
ate articulatory motor plan. Subsequently, 
articulation, the generation of overt speech, 
is the physical realization of the selected 
motor plan. 

The production of overt speech, however, 
does not represent the final stage in Lev- 
elt’s model of speech production. In fact, 
the model also includes a feedback loop by 
which the speaker monitors 
his or her own speech for errors or external 
interference, re-engaging the model at the 
level of conceptual preparation to make ap- 
propriate corrections if necessary. 
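As a reading aid, the serial stages and the self-monitoring loop described above can be laid out as an ordered pipeline. This is a schematic sketch only, not an implementation of Levelt's model. The stage labels follow the text, while the `STAGES` list, the `produce` function, the `toy_monitor` callback, and the plain string standing in for a message are hypothetical conveniences.

```python
# Stage labels follow the text; each is paired with the system it belongs to.
STAGES = [
    ("conceptual preparation",      "rhetorical/semantic/syntactic"),
    ("grammatical encoding",        "rhetorical/semantic/syntactic"),
    ("morphophonological encoding", "phonological/phonetic"),
    ("phonetic encoding",           "phonological/phonetic"),
    ("articulation",                "phonological/phonetic"),
]

def produce(message, monitor, max_passes=3):
    """Pass a message through every stage in a fixed serial order.

    `monitor` stands in for self-monitoring: it inspects the output and
    returns a corrected message, or None if no error was detected. A
    correction re-enters the pipeline at conceptual preparation.
    """
    trace = []
    for _ in range(max_passes):
        for name, _system in STAGES:
            trace.append(name)   # a real stage would transform the message here
        correction = monitor(message)
        if correction is None:
            break
        message = correction
    return message, trace

# Toy monitor (hypothetical): treats one slip as an error and repairs it.
def toy_monitor(utterance):
    return "the cat" if utterance == "the cad" else None

final, trace = produce("the cad", toy_monitor)
print(final)        # the cat
print(len(trace))   # 10
```

The trace has ten entries because the detected slip sends the message through all five stages a second time. The sketch captures only the ordering and the re-entry point, not the representations each stage actually computes.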

On a neural level, Indefrey and Levelt 
(1999) describe the functioning of Levelt’s 
model as being implemented in a primar- 
ily left-hemisphere-lateralized cortical net- 
work. They propose that the initial process 
of conceptual preparation occurs in a range of 
heteromodal and cortical association areas 
(specific to the modality of contextual in- 
formation preceding the present production 
process), the activity of which converges 
with the selection of a lexical concept oc- 
curring in the left middle temporal gyrus. 
Subsequently, Wernicke’s area (roughly the 
temporal-parietal junction) is activated by 
the retrieval of phonological codes associ- 
ated with retrieved lexical concepts followed 
by activation of Broca’s area (posterior left 
inferior frontal cortex) and the left mid- 
superior temporal lobe, the sites at which 
phonological encoding continues indepen- 
dent of lexical information. Broca’s area then 
remains active and is joined by activation 
in other supplementary motor areas and in 
the cerebellum during the process of ar- 
ticulation. Indefrey and Levelt (1999) fur- 
ther specify that self-monitoring, whether 
occurring covertly or overtly, activates re- 
gions of superior temporal lobe, as well 
as supplementary motor areas related to 
articulation. 

Critically, the authors specify that this 
proposed speech production network is ac- 
tivated as such only during relatively auto- 
matic (i.e., seemingly without effort or con- 
scious awareness and potentially occurring 
in parallel with other processes) speech pro- 
duction as opposed to the process of speech 
production specifically engaged during more 
controlled (effortful, conscious processing 
requiring capacity-limited attention and 
operating in a serial fashion; Schneider 
& Shiffrin, 1977) information processing, 
as would be more likely during the perfor- 
mance of an experimental cognitive task. 
Not only would speech production involv- 
ing controlled selection, retrieval, and in- 


to activate the network previously described 
(Indefrey & Levelt, 1999), but it would also 
likely activate a relatively more anterior re- 
gion of left inferior prefrontal cortex (Gold 
& Buckner, 2002; Kounios et al., 2003) that 
appears to facilitate controlled selection of 
information stored in long-term memory by 
resolving interference from activated, non- 
target pieces of information (Thompson- 
Schill et al., 2002). 


Thought Disorder or 
Speech Disorder? 


An area of long-running controversy in the 
study of formal thought disorder is whether 
the phenomenon is ultimately a disorder 
of thought itself, or a disorder of overt 
speech. Specifically, rather than considering 
this markedly disrupted ability to commu- 
nicate a speech production problem, we in- 
fer that the locus of pathology lies in the 
thought processes underlying the intentional 
production of speech (see Chaika, 1982, for 
additional discussion of this inference). Un- 
fortunately, as discussed in depth by Critch- 
ley (1964) and Maher (1991), these thought 
processes themselves are not directly observ- 
able. Therefore, measurement of any puta- 
tive disruption must necessarily occur indi- 
rectly — usually with the assumption that the 
psychomotor transformation from a thought 
to a spoken utterance occurs with a nor- 
mal range of fidelity. It is certainly worth- 
while considering whether this assumption 
is warranted. 

Referring back to Levelt’s model of nor- 
mal speech production (Figure 21.1), we 
can consider each of the putative process- 
ing stages and attempt to infer what the 
observable product of “lesioning” each in iso- 
lation would sound like. Let us first examine 
the processing stage most closely affiliated 
with the actual act of speaking, the process 
of physical articulation. If an intact would- 
be utterance moves into the stage of phys- 
ical articulation only to be compromised, 
one might expect the output to contain the 
spoken in a manner that systematically dis- 
torts the articulatory score of the phrase. 
Such a spoken product would not resem- 
ble formally disordered thought but instead 
the product of conditions such as dysarthria 
or speech apraxia (Dick et al., 2001), two 
disorders familiar to neurologists and speech 
pathologists. 

Similarly, if the lesion underlying formal 
thought disorder involved the process of 
phonetic encoding, one would expect spo- 
ken output to resemble speech characteris- 
tics of what Dodd and Crosbie (2002) refer 
to as “speech programming deficit disor- 
der” in which speech is produced fluently, 
but the distorted phonological score would 
yield speech devoid of normal patterns of 
pitch and syllabification — perhaps sound- 
ing severely slurred — in the absence of 
dysarthria or speech apraxia. 

The immediately preceding stage, mor- 
phophonological encoding, the first point in 
the process at which a word’s phonolog- 
ical code is processed independent of se- 
mantic content, has also been shown to be 
compromised in isolation specifically in pa- 
tients suffering from an anomic aphasia or 
a word-finding deficit (Indefrey & Levelt, 
1999). Typically, such patients describe a 
sense of frustration over feeling that they 
have particular verbal concepts in mind but 
cannot retrieve the phonological code — can- 
not think of how to say the corresponding 
word. Certainly this condition is debilitat- 
ing, but apart from the superficial similarity 
with thought blocking (sometimes consid- 
ered a feature of negative formal thought 
disorder, but not a construct included in 
Andreasen’s and Grove’s system), anomic 
aphasia does not resemble formal thought 
disorder. 

Finally, having ruled out deficits in the 
stages of speech production constituting 
Levelt’s (1989, 1999) phonological/phonetic 
system, we work backward to the stage at 
which lemmas are selected and retrieved 
from the mental lexicon, initiating the gram- 
matical encoding process. Agrammatic pa- 
tients, such as patients suffering from Broca’s 
aphasia, are characterized by speech in 


propriately, but the particular form of each 
word is not adjusted to accommodate the 
grammatical demands of nearby words or 
the phrase as a whole (e.g., verbs are not con- 
jugated correctly; Indefrey & Levelt, 1999). 
Although, as apparent in the quotes at 
the beginning of the chapter, patients with 
formal thought disorder make grammatical 
mistakes in their speech, it is not necessar- 
ily clear that they make such mistakes more 
frequently than non-thought-disordered in- 
dividuals do. 

Evidence does exist (Andreasen & Grove, 
1986; Berenbaum & Barch, 1995), however, 
that patients with formal thought disorder 
show a small but significant level of word 
substitution and approximation, which is 
the predictable consequence of faulty re- 
trieval of lemmas from the mental lexicon, 
the initial process occurring under the head- 
ing of grammatical encoding. We therefore 
conclude that we can rule out lesions to 
all processing stages occurring after lemma 
retrieval up to and including the articula- 
tion of overt speech. The cause of formal 
thought disorder must therefore exist some- 
where along the way through the rhetor- 
ical/semantic/syntactic system (including 
application of a mental model of the lis- 
tener, the conceptual preparation, etc.) or in 
the self-monitoring feedback loop. Although 
where one draws the line between “thought” 
and “speech” is somewhat of a philosoph- 
ical issue, we propose that all processes un- 
derlying these two suspect components cer- 
tainly warrant the label “thought,” justifying 
the term “formal thought disorder” rather 
than “speech disorder.” 


Overview of Cognitive Models of 
Thought Disorder 


The first major psychological discussion of 
the pathology of thought disorder was pro- 
vided by Eugen Bleuler, a Swiss psychi- 
atrist and theorist contemporary to both 
Sigmund Freud, founder of modern clini- 
cal psychology, and Wilhelm Wundt, often 
psychology. Based on his observation of pa- 
tients with psychotic disorders (i.e., psy- 
chiatric disorders manifesting both severe 
reality distortion symptoms, such as hal- 
lucinations and delusions, and a significant 
level of behavioral disorganization), includ- 
ing schizophrenia, Bleuler (1911/1951) ar- 
gued that the cause of psychosis involves a 
fundamental “loosening of associations” be- 
tween ideational elements, which results in 
a conceptual confusion that manifests itself 
in disordered speech (in addition to other 
symptoms) — an idea preserved almost ex- 
actly in its original form in more contempo- 
rary definitions of thought disorder, as dis- 
cussed earlier. 

Furthermore, Bleuler’s conceptualization 
of the pathology of psychosis is analogous 
to more contemporary cognitive explana- 
tions of disordered information processing 
(e.g., Andreasen et al., 1999; Goldman- 
Rakic, 1995; Oltmanns & Neale, 1975; Sil- 
verstein et al., 2000) in that he proposed 
that a critical parameter of a fundamental 
cognitive mechanism is abnormal and that 
the consequences of this single defect ac- 
count parsimoniously for the diverse phe- 
nomena observed in the behavior of many 
psychiatric patients. Like Bleuler, the propo- 
nents of these contemporary models iden- 
tify the pathological cognitive mechanism 
and delineate how the functional conse- 
quences of the abnormality are propagated 
through subsequent processing stages, cur- 
tailing the normal integration of thought 
and behavior or, more specifically with re- 
spect to schizophrenia, impairing the ability to use 
contextual information in the efficient guid- 
ance of ongoing, goal-directed behavior. We 
shall discuss at length two such models that 
were created in investigation of informa- 
tion processing abnormalities in schizophre- 
nia and examine how these models might ac- 
count for symptoms such as formal thought 
disorder. 


Cohen’s and Braver’s Model 


Cohen, Braver, and colleagues (Braver, 
Barch, & Cohen, 1999; Braver et al., 2001; 
Cohen & Servan-Schreiber, 1992) have proposed a model of 
schizophrenic information processing (see 
Figure 21.2) in which at least a subset of 
information processing deficits observed in 
schizophrenia patients results from a distur- 
bance in the interaction between a cogni- 
tive module specialized for the representa- 
tion, active maintenance, and updating of 
information regarding stimulus context and 
a module responsible for the storage of 
learned behavioral contingencies. Given 
that an individual’s repertoire of stimulus-response 
associations must be directly ac- 
cessible to the behavioral selection process 
bridging the gap between the encoding of 
a stimulus and the execution of a response, 
the existence of a pathway allowing interac- 
tion between these stored behavioral contin- 
gencies (i.e., long-term memories, including 
motor plans) and the context processing 
module allows contextual information to in- 
fluence the selection and execution of ongo- 
ing behavioral plans, ideally biasing behavior 
in a goal-appropriate manner (Braver, Barch, 
& Cohen, 1999). Context information there- 
fore mediates the selection of learned asso- 
ciations, which otherwise would be dictated 
by environmental stimuli. 

To serve this function, context informa- 
tion must be represented and maintained in 
a manner that leaves it both buffered against 
interference (from task-irrelevant stimuli) 
and available to be updated as required 
by changing task demands. In an extreme 
example, this system must be capable of 
exercising cognitive control: utilizing infor- 
mation from a previous stimulus to bias pro- 
cessing of other relevant information and 
to suppress processing of irrelevant infor- 
mation and then reflecting that critical in- 
formation in the selection of appropriate 
goal-directed behavior even in the face of 
competition from more salient behavioral 
responses. 

Citing computational evidence (Braver, 
Barch, & Cohen, 1999), the investigators 
argue that variable efficiency of the in- 
teraction between the context processing 
module and the learned behavioral con- 
tingencies module could, in fact, account 
Figure 21.2. Cohen, Braver, and colleagues’ model of information processing disruption in 
schizophrenia. Reprinted with permission from Braver, Barch, & Cohen, 1999. 


for schizophrenia patients’ apparent insen- 
sitivity to contextual information. They 
elaborate (Braver, Barch, & Cohen, 1999) 
that a gating mechanism must exist be- 
tween the two modules, allowing contextual 
information to be encoded and main- 
tained without interference from irrelevant 
perceptual information under certain cir- 
cumstances and, at other times, making con- 
text information available for updating or to 
influence activation in the association stor- 
age module. Disrupted information process- 
ing in schizophrenia is therefore the con- 
sequence of failure of this “gate” between 
the association storage module and the con- 
text processing module to properly open 
and close, degrading the fidelity of encoding 
and maintenance of goal-related informa- 
tion, as well as the effectiveness of its biasing 
influence. 
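
The gating idea can be made concrete with a toy simulation. The following Python sketch is not the investigators' computational model; it is a minimal illustration under stated assumptions. The function names, the single `gate_reliability` parameter, and the decision to treat every delay item as a distractor are all hypothetical choices made for the example.

```python
import random


def maintain_context(cue, distractors, gate_reliability, rng):
    """Return the context held after a delay filled with distractors.

    gate_reliability (assumed parameter) is the probability that the gate
    behaves correctly on a given step: opening for the task cue and
    staying closed for each distractor.
    """
    context = None
    # The gate should open so that the cue is encoded as context.
    if rng.random() < gate_reliability:
        context = cue
    for item in distractors:
        # The gate should stay closed; a failure lets the distractor
        # overwrite the maintained context.
        if rng.random() >= gate_reliability:
            context = item
    return context


def maintenance_accuracy(gate_reliability, delay, trials=2000, seed=0):
    """Proportion of trials on which the cue survives a delay of `delay` items."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        held = maintain_context("cue", ["noise"] * delay, gate_reliability, rng)
        hits += held == "cue"
    return hits / trials
```

With a perfectly reliable gate the cue always survives the delay. With an unreliable gate, accuracy falls as the delay lengthens, which mirrors in simplified form the claim that a faulty gate degrades both the encoding and the maintenance of goal-related information.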

In support of their model, Cohen, Braver, 
and colleagues (Cohen et al., 1999) have pre- 
sented data from schizophrenia patients and 
controls performing three tasks — a single- 
trial version of the Stroop task (Stroop, 
1935), a lexical disambiguation task, and a 
“one-back” continuous performance task re- 
quiring subjects to continuously match each 


stimulus with the stimulus presented im- 
mediately prior (Cohen & Servan-Schreiber, 
1992). In each task, the difficulty of main- 
taining context information and using it to 
select appropriate behavior was manipu- 
lated by varying the length of time during 
which context information must be main- 
tained prior to response selection, as well 
as the salience of task-appropriate responses 
relative to task-inappropriate responses (i.e., 
the demand for cognitive control during 
the behavioral selection process). The in- 
vestigators argue that, overall, schizophrenia 
patients display a differential insensitivity 
to contextual information, and this insen- 
sitivity interacts with variable information 
maintenance demands in two out of three 
experiments (Cohen et al., 1999). Addi- 
tionally, the investigators report a signifi- 
cant negative correlation between context 
sensitivity and severity of disorganization 
symptoms (including formal thought disor- 
der) among schizophrenia patients, suggest- 
ing that the ability to effectively and flexibly 
bind ideational elements to an appropriate 
context underlies both the production of or- 
ganized speech and successful performance 
on these context-heavy tasks. 
to bear evidence that this contextual infor- 
mation is actively maintained, updated, and 
buffered against interference in the dorso- 
lateral prefrontal cortex (Barch et al., 1997; 
Braver et al., 1997; Cohen et al., 1997; also 
see Goel, Chap. 20), which is also implicated 
in the exercise of cognitive control (Braver, 
Reynolds, & Donaldson, 2003). More re- 
cently, Miller and Cohen (2001; also, Kane 
and Engle, 2002, and Duncan, 2001) have 
reviewed evidence that the prefrontal cor- 
tex is not only capable of maintaining rep- 
resentations of context despite interference 
but is also critical in the modulation of activ- 
ity in other regions of the brain thought to 
be associated with modality-specific buffers, 
with the ability to hold long-term memo- 
ries at a high level of activation and the sub- 
sequent selection of goal-directed behavior. 
Incorporating an additional level of analysis, 
Cohen, Braver, and colleagues cite evidence 
that phasic dopamine activity modulates the 
gate between the prefrontal context process- 
ing module and the individual’s repertoire 
of learned behavioral contingencies (Braver, 
Barch, & Cohen, 1999). 

Although we agree that dopaminergic 
modulation of cortical activity certainly 
plays a role in the pathology of impaired in- 
formation processing in psychosis (e.g., Abi- 
Dargham et al., 2002; Okubo et al., 1997), 
the particular mechanism Cohen and col- 
leagues propose (i.e., increased tonic and 
decreased phasic dopamine activity; Braver, 
Barch, & Cohen, 1999), however, remains 
controversial (see, for example, Grace, 1991, 
or Laruelle, Kegeles, & Abi-Dargham, 2003). 

In addition to being able to account 
qualitatively for the cognitive deficits the 
model was designed to simulate (Braver, 
Barch, & Cohen, 1999), the proposition that 
schizophrenia patients fail to appropriately 
use contextual information to guide ongo- 
ing behavior in a goal-directed manner cer- 
tainly has face validity. One might argue, 
however, that any behavior judged to be ab- 
normal, or more specifically, deficient with 
respect to a given goal state, could be ex- 
plained by a failure of this context process- 
ing mechanism. 


this proposition raises a question regarding 
how this model or any like it is distinct from 
one that simply predicts that schizophre- 
nia patients will perform any given task in- 
correctly. The distinction indeed exists and 
highlights the reason why cognitive control 
is so critical to the model’s successful im- 
plementation. Specifically, patients will per- 
form a given task correctly when the cor- 
rect behavioral response is somehow most 
salient or dominant with respect to other po- 
tential responses; in this case, the represen- 
tation of context and the prepotency of the 
correct response are redundant mechanisms. 
When the correct response is less salient 
or less “prepotent” than an incorrect, dis- 
tracter response, patients will tend to choose 
the distracter. Nonpsychotic subjects, con- 
versely, will be more capable of using 
representations of context to inhibit the pre- 
potent distracter and select the appropriate, 
less salient behavioral response — they will be 
more capable of exercising cognitive control. 
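
The prediction in the preceding paragraph can be illustrated with a small worked example. The sketch below is an assumption-laden simplification, not the published model: the additive activation rule, the `context_strength` parameter, and all numeric values are hypothetical and chosen only to show how a prepotent response wins when context is poorly maintained.

```python
def select_response(prepotency, context_bias, context_strength):
    """Pick the response with the highest total activation.

    prepotency: dict mapping response -> habitual strength.
    context_bias: dict mapping response -> support from the task context.
    context_strength: 0.0 (context lost) to 1.0 (context fully maintained).
    """
    activation = {
        response: prepotency[response]
        + context_strength * context_bias.get(response, 0.0)
        for response in prepotency
    }
    return max(activation, key=activation.get)


# Stroop-like trial: word reading is prepotent, color naming is the task.
prepotency = {"read_word": 0.8, "name_color": 0.4}
context_bias = {"name_color": 0.7}

# Context fully maintained: the weaker, task-appropriate response wins.
intact = select_response(prepotency, context_bias, context_strength=1.0)

# Context degraded: the prepotent distracter wins.
degraded = select_response(prepotency, context_bias, context_strength=0.3)

# Redundant case: the task asks for the prepotent response, so degraded
# context does not change the outcome.
redundant = select_response(prepotency, {"read_word": 0.7}, context_strength=0.3)
```

The third case corresponds to the situation in which context and prepotency are redundant: the correct response is selected whether or not context is well maintained, so no deficit is predicted.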

This focus on cognitive control therefore 
represents a critical step in the develop- 
ment of this model — a process that should 
continue to advance, incorporating findings 
from studies of neural correlates of cognitive 
control (e.g., Braver et al., 2003), the cogni- 
tive mechanisms underlying recognition and 
resolution of response conflict (Botvinick 
et al., 2001), and the specificity of the find- 
ings to patients suffering from psychosis 
(Barch et al., 2003). 

Finally, two additional issues awaiting res- 
olution are also worthy of brief mention. 
The first area involves the mechanism by 
which particular behavioral responses ac- 
quire their levels of salience, or prepotency. 
Cohen, Braver, and colleagues refer to be- 
havioral learning principles to account for 
how associations are formed between par- 
ticular pieces of contextual information and 
specific outcomes (Braver, Barch, & Carter, 
1999), linking contextual information to in- 
centive salience, and therefore to behavioral 
response salience; however, they do not ac- 
count for the initial identification and cate- 
gorization of pieces of information (unless a 
stochastic process of sampling reward value 
Figure 21.3. Hemsley’s and Gray’s model of disrupted information processing in 
schizophrenia. Reprinted with permission from Gray et al., 1991. 


from among the set of available stimuli is 
assumed), nor do they argue that associa- 
tions among behaviors, contextual informa- 
tion, and outcomes will generalize across 
situations. 

Additionally, future discussion of con- 
textual information, as defined by Cohen, 
Braver, and colleagues, might benefit from 
consideration of how this particular con- 
struct relates to definitions of context in 
other fields of research within cognitive psy- 
chology and neuroscience. Borrowing an ex- 
ample from the study of conditioning in non- 
human animals, investigators predictably 
define context as the aspects of the phys- 
ical setting in which a particular condition- 
ing trial takes place that are immediately ob- 
servable by the animal (e.g., Fanselow, 2000; 
Goddard, 2001; Rudy & O'Reilly, 2001). 
This definition of context is relatively consis- 
tent and uniform across studies, facilitating 
the construct’s incorporation into behavioral 
models and the subsequent generalization of 
those models to analogous, ecologically valid 
situations for which the model can generate 
behavioral predictions (such as the behavior 
of a recovered drug addict in a physical set- 
ting with which drug use is associated; e.g., 
Shaham et al., 2003). Cohen, Braver, and 
colleagues (Cohen et al., 1999), on the other 
hand, seem to define context purely in terms 
of performance on cognitive tasks. These am- 
biguities aside, as we will discuss, this and the 
following model provide critical theoretical 


traction in our attempt to understand how 
information processing abnormalities might 
contribute to the manifestation of thought 
disorder. 


Hemsley’s and Gray’s Model 


A model with properties analogous to as- 
pects of the model developed by Cohen 
and colleagues, but with important incon- 
gruities as well, has developed in a body 
of publications authored by Hemsley, Gray, 
and colleagues over the past two decades 
(Gray, 1982; Gray, 1995; Gray, 1998; Gray 
et al., 1991; Hemsley, 1987; Hemsley, 1993; 
Hemsley, 1994; Weiner, 1990). As summa- 
rized most recently by Gray (1998), this 
model of disordered information process- 
ing in schizophrenia involves a disruption 
in the processes by which past regulari- 
ties of experience are integrated with on- 
going stimulus recognition and behavior se- 
lection and monitoring (see Figure 21.3). 
This failure to engage information fluidly 
from longer-term memory in the interpre- 
tation of the current perceptual state of 
the world and the prediction of subsequent 
states is essentially the failure of an informa- 
tion processing system to identify and utilize 
contextual information in the automatic 
guidance of goal-directed behavior. As de- 
lineated by Gray (1998), what should seem 
familiar to the patient and elicit auto- 
matic processing of information (seemingly 
potentially, in parallel with other processes) 
instead seems novel, engaging finite, con- 
trolled information processing resources (ef- 
fortful, conscious processing requiring atten- 
tional focus and operating in a serial fashion; 
Schneider & Shiffrin, 1977). Consistent with 
an earlier proposal put forth by Nuechterlein 
and Dawson (1984), Hemsley (1994) and 
Gray (1998) argue that schizophrenia pa- 
tients are significantly more likely to engage 
these controlled processes than are nonpsy- 
chotic subjects, resulting in patients’ en- 
gaging information processing bottlenecks 
significantly more frequently, and, through 
physiological mechanisms discussed subse- 
quently, this disparity leads to the conscious 
experience of psychosis. 

Similar to the model discussed previously, 
Hemsley’s and Gray’s model accounts for 
the influence of contextual information on 
the goal-oriented direction of behavior. Un- 
like the previous model, however, Hemsley 
and Gray (Gray et al., 1991) mention explic- 
itly that their model includes a dedicated 
comparator mechanism that examines the 
products of regular perceptual sampling of 
the environment within the context of a pre- 
dicted model of the perceptual world (cor- 
rected for the influence of ongoing motor 
plans, as well as other dynamic aspects of the 
perceptual world stored in long-term mem- 
ory). The results of this comparison are then 
abstracted according to the degree to which 
they match the prediction and transmitted 
to the motor programming system, which 
interprets a relative “match” signal as an in- 
dication that it should allow the current mo- 
tor program to continue (i.e., “the behaviors 
executed are having the predicted effects”) 
and a relative “mismatch” signal as an indi- 
cation that it should interrupt the ongoing 
motor plan because something novel or un- 
expected has occurred. 
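
The logic of the comparator can be sketched in a few lines of Python. This is a deliberately reduced illustration rather than an implementation of Gray's model: the model describes relative, graded match and mismatch signals, whereas the sketch assumes a binary signal, a fixed `tolerance`, and perceptual states represented as short lists of numbers. All names are hypothetical.

```python
def compare(predicted, observed, tolerance=0.1):
    """Return 'match' if every observed feature lies within tolerance of its prediction."""
    errors = [abs(p - o) for p, o in zip(predicted, observed)]
    return "match" if max(errors) <= tolerance else "mismatch"


def motor_step(program, predicted, observed, tolerance=0.1):
    """Continue the current motor program on a match; interrupt it on a mismatch.

    Returns the program to keep running (None if interrupted) and the signal.
    """
    signal = compare(predicted, observed, tolerance)
    if signal == "match":
        return program, signal
    return None, signal


# The world behaves as predicted, so the ongoing program continues.
kept, signal_a = motor_step("reach", predicted=[0.5, 0.2], observed=[0.52, 0.18])

# Something unexpected occurs, so the ongoing program is interrupted.
stopped, signal_b = motor_step("reach", predicted=[0.5, 0.2], observed=[0.9, 0.2])
```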

However, the presence of a relative mis- 
match signal orients the individual’s atten- 
tion to the possibility of a meaningful change 
in the perceptual environment, increasing 
the intensity of sensory processing (Gray, 
1998) — a proposition that converges with 
Sokolov’s (1963) suggestion that the auto- 


mans and other animals occurs in order to 
increase the cognitive resources available for 
sensory processing. 

For individuals suffering from schi- 
zophrenia, the fundamental deficit “... lies 
at the moment of integration of past ex- 
perience with current information han- 
dling” (Gray, 1998, p. 261). Specifically, 
information about past regularities of expe- 
rience is not integrated fluently with cur- 
rent perceptual information, preventing the 
system from making an appropriate pre- 
diction about the next state of the per- 
ceptual world and markedly decreasing the 
likelihood that a match signal will be gen- 
erated. Consequently, the impaired individ- 
ual experiences the detection of novelty in 
the perceptual environment much more fre- 
quently than would an individual generat- 
ing more frequent match signals. In light of 
the increased sensory processing demands 
and the concomitantly increased demand 
placed on Gray’s comparator mechanism, 
as well as the need to select and initiate a 
different motor program, the cognitive pro- 
cessing demands once fulfilled automatically 
now require capacity-limited, controlled 
processing. 
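
A short simulation may help show why degraded integration of past regularities would raise the rate of mismatch signals. The sketch is illustrative only. It assumes a perfectly regular environment, a hypothetical `integration` parameter between 0 and 1, and prediction error that grows linearly as integration falls; none of these choices comes from the source.

```python
import random


def mismatch_rate(integration, steps=5000, tolerance=0.2, seed=1):
    """Fraction of steps on which the comparator signals a mismatch.

    integration: 1.0 means past regularities fully inform the prediction;
    lower values add random error to the prediction (an assumption made
    for illustration).
    """
    rng = random.Random(seed)
    mismatches = 0
    for t in range(steps):
        observed = (t % 4) / 4.0  # a perfectly regular environment
        error = rng.uniform(-1.0, 1.0) * (1.0 - integration)
        predicted = observed + error
        if abs(predicted - observed) > tolerance:
            mismatches += 1
    return mismatches / steps
```

With full integration the environment never appears novel. As integration falls, an unchanged environment produces frequent mismatch signals, each of which would call for the capacity-limited, controlled processing described above.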

This conjecture does seem to reflect 
subjective experiences reported by many 
schizophrenia patients, who describe feel- 
ing overwhelmed by a somehow foreign- 
seeming, disjointed perceptual landscape, 
unable to discern more meaningful features 
of the environment from less meaningful 
features (Davis & Cutting, 1994; McGhie & 
Chapman, 1961). Indeed, Hemsley’s (1994) 
and Gray’s (1998) proposal that actively psy- 
chotic schizophrenia patients engage their 
sensory environment in a much more at- 
tentionally intensive manner, all the while 
sensing endogenous indications that some 
aspect of that environment is novel or un- 
expected, appears to account for patients’ 
reports of attributing increased significance 
to aspects of the environment that non- 
schizophrenic individuals would consider in- 
significant (Davis & Cutting, 1994), poten- 
tially participating in the development of 
delusional beliefs (e.g., Maher, 2002). 
ond consequence of the pathologically fre- 
quent interruption of ongoing motor pro- 
grams in schizophrenia patients: disruption 
in the “labeling” of interrupted motor pro- 
grams as internally generated — a conse- 
quence of impaired self-monitoring. That is, 
the patient recognizes the results of the (at 
least partial) execution of a motor program, 
but does not attach a sense of personally 
willed intention to the motor program — a 
mechanism first proposed by Frith (1987). 
Considering that speech, even covert or 
subvocal speech, essentially constitutes a 
complex motor program, a consequence of 
a failure to recognize that motor program as 
a behavior willfully enacted by oneself may 
lead to the conclusion that the speech expe- 
rienced was generated by an agent other than 
the individual — the definition of an auditory 
hallucination (see Ford et al., 2002, for a pos- 
sible fronto-temporal correlate of this phe- 
nomenon). In this manner, Gray (1998) and 
Frith and colleagues (1992) argue that the failure 
to associate willed intention with the exe- 
cution of a motor program can lead to the 
experience of a significant perceptual abnor- 
mality, such as a hallucination. 

With respect to its neural implementa- 
tion, Hemsley’s and Gray’s model (Gray, 
1998) focuses on regulatory functions of 
dopamine, as does Cohen’s and Braver’s; un- 
like Cohen’s and colleagues’ model, how- 
ever, it places greatest emphasis on the 
dopaminergic modulation of a structure 
other than the prefrontal cortex — namely, 
the nucleus accumbens, a site of great inte- 
gration of disparate neural circuits, located 
in the basal ganglia. Gray (1998) posits that 
the comparison between predicted and ob- 
served perceptual information is carried out 
in the ventral portion of the frontal lobe, 
and the results are transmitted through a me- 
dial temporal lobe pathway to the nucleus 
accumbens. Importantly, this excitatory, 
glutamatergic input to the nucleus accum- 
bens is paired with an inhibitory efferent 
connection from the dopamine-releasing nu- 
clei of the midbrain. Gray suggests that when 
the excitatory input is disrupted, the nucleus 
accumbens receives a relative overload of 


ing the ability of the ventral frontal lobe to 
communicate match signals and in turn set- 
ting off a chain reaction of inhibitory steps 
throughout the basal ganglia, eventually in- 
hibiting the reticular nucleus of the thala- 
mus. Once the reticular nucleus is inhibited, 
the excitatory, largely feed-forward loops 
comprising thalamocortical sensory infor- 
mation processing circuits are left relatively 
unchecked — a consequence possibly related 
to the subjective sense of increases in the de- 
gree of conscious sensory processing under- 
way (also see Grace, Moore, & O’Donnell, 
1998). Moreover, this thalamocortical disin- 
hibition and concomitant sense of increased 
conscious processing of stimuli facilitates pa- 
tients’ and controls’ differential engagement 
of highly controlled cognitive processes — a 
functional dissociation seen most strikingly 
in the prefrontal cortex (Jansma et al., 2001). 
Furthermore, this thalamic disinhibition dis- 
rupts the functioning of parietal and inter- 
connected prefrontal areas active during the 
attribution of overt behavior as being self- 
generated (Frith et al., 1992) — an operation 
closely related to the functioning of the ven- 
tral prefrontal comparator. 

Two areas of concern warrant brief men- 
tion. The first involves the possibility that 
the model’s predictions might prove rel- 
atively nonspecific with respect to the 
primary locus of pathology: Any number 
of disruptions in the proposed information 
processing system would result in a marked 
drop in the number of match signals re- 
ceived, leading to a greater degree of con- 
trolled processing. In addition to this po- 
tential nonspecificity, one might argue that 
the behavioral evidence cited in support 
of the model does not easily map onto 
the clinical phenomena for which it at- 
tempts to account. Although Hemsley’s and 
Gray’s model seems to relate in meaning- 
ful ways to the subjective experiences of 
schizophrenia patients, much of the behav- 
ioral evidence supporting it (Gray et al., 
1991; Gray, 1998) is drawn from studies of 
latent inhibition (Lubow, 1959) — a phe- 
nomenon of classical conditioning defined as 
the difference in amplitude or intensity of 
uli of two types: stimuli to which the subject 
was already exposed prior to association 
with the present response and stimuli oth- 
erwise novel to the subject when first 
associated with the present response. An 
association usually occurs more readily be- 
tween the “non-pre-exposed” stimulus and 
the response, which is a phenomenon be- 
lieved to be associated with an inhibition of 
association formation caused by the persist- 
ing representation of the “pre-exposure” ex- 
perience of the stimulus. The contextual in- 
formation comprised by the representation 
of the pre-exposure therefore influences the 
efficiency with which a subsequent behav- 
ioral association is formed. Taken alone, ev- 
idence that schizophrenia patients do not 
show expected latent inhibition effects may 
be interpreted as a failure by patients to uti- 
lize contextual information in the behav- 
ioral conditioning domain. One might ques- 
tion, however, whether evidence taken pri- 
marily from classical conditioning serves as 
an adequate foundation for an information 
processing model as wide-ranging and com- 
plex as Gray’s and Hemsley’s and that carries 
implications for elusive aspects of cognition 
such as conscious awareness. 
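
For readers unfamiliar with latent inhibition, the effect can be sketched with a simple error-correcting learning rule. The example is a generic illustration and is not drawn from Hemsley's and Gray's work: the update rule, the `decay` applied to the learning rate for each pre-exposure, and the `uses_context` flag standing in for a learner who fails to carry pre-exposure history forward are all assumptions.

```python
def trials_to_criterion(preexposures, uses_context=True,
                        base_rate=0.3, decay=0.8, criterion=0.9):
    """Count conditioning trials needed for association strength to reach criterion."""
    # Each non-reinforced pre-exposure lowers the stimulus's learning rate,
    # but only if the learner carries that history forward.
    rate = base_rate * (decay ** preexposures if uses_context else 1.0)
    strength, trials = 0.0, 0
    while strength < criterion:
        # Error-correcting update: move a fraction of the way toward 1.0.
        strength += rate * (1.0 - strength)
        trials += 1
    return trials


novel = trials_to_criterion(preexposures=0)
preexposed = trials_to_criterion(preexposures=5)
no_latent_inhibition = trials_to_criterion(preexposures=5, uses_context=False)
```

In this sketch a pre-exposed stimulus takes longer to condition than a novel one, which is the latent inhibition effect. A learner that ignores the pre-exposure history conditions to both stimuli equally quickly, analogous to the absence of the expected effect reported for schizophrenia patients.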


Studies of Information Processing Deficits 
Related to Formal Thought Disorder 


To help us fill in the gap in available 
theory between mechanisms underlying a 
modality-nonspecific degradation in infor- 
mation processing ability and the mech- 
anisms generating organized, goal-directed 
speech, we turn to the literature on 
(quasi) experimental approaches to study- 
ing the pathology of formal thought disorder. 
Thankfully, this work has been examined 
capably in a recent meta-analysis by Kerns 
and Berenbaum (2002), who organize the 
range of published hypotheses involving spe- 
cific cognitive impairments associated with 
formal thought disorder into four general 
categories. 

We have already discussed the first of 
these categories, involving investigations of 


cognitive mechanisms relatively proximal to 


speech production, such as those included in 
Levelt’s phonological/phonetic system (e.g., 
Barch & Berenbaum, 1996, and Goldberg 
et al., 1998). In agreement with our con- 
clusion, Kerns and Berenbaum (2002) re- 
port only a very minor relationship between 
phonological/phonetic system impairment 
and ratings of thought disorder and argue 
that this relationship is carried entirely by 
measures of anomia and word substitution 
and approximation, deficits likely related to 
the retrieval of lemmas from the mental lexi- 
con (Indefrey & Levelt, 1999). The vast ma- 
jority of clinical phenomena related to for- 
mal thought disorder (Andreasen & Grove, 
1986), however, is left unaccounted for by 
deficient speech production. 

Kerns’ and Berenbaum’s (2002) second 
category of hypothesized deficit involves in- 
creased amount of activation spreading au- 
tomatically between nodes in semantic net- 
works (assumed to operate like standard 
neural network models; Dell & O’Seaghdha, 
1991), resulting in increased priming of 
nearby semantic associates of a target word, 
raising the probability that one of these non- 
target words will be retrieved and integrated 
into ongoing speech. A relatively intense 
area of study in schizophrenia research (for 
a review, see Minzenberg et al., 2002), in- 
vestigators looking for evidence of abnormal 
semantic network priming have reported 
seemingly contradictory findings, with some 
showing evidence of hyper-priming at tested 
nodes (suggesting increased amount of ac- 
tivation spreading throughout the network; 
Spitzer et al., 1993, 1994; Weisbrod et al., 
1998) and others showing evidence of hypo- 
priming at tested nodes (suggestive of a re- 
duced amount of activation; Blum & Frei- 
des, 1995; Passerieux et al., 1997; Besche 
et al., 1997). This contradiction prompted 
the suggestion that thought-disordered pa- 
tients actually experience an increase in 
distance of activation spread, while main- 
taining an overall level of activation compa- 
rable with controls, effectively yielding an 
increased number of nodes activated, with 
none activated to as high a degree as controls’ 
and Berenbaum (2002) reject the hyper- 
priming hypothesis, and indeed report that a 
small amount of evidence exists supporting 
increased distance of activation spread and 
decreased amount of activation at any given 
node, suggesting that a thought-disordered 
patient should be slightly more likely than 
controls to retrieve a word relatively dis- 
tantly related to the target word. 
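
The contrast drawn above between a greater amount of spreading activation and a greater distance of spread can be made concrete with a toy calculation. The following Python sketch is purely illustrative and is not taken from any of the cited studies: it assumes a fixed activation budget that falls off geometrically with associative distance from the target word, and the function names, decay rates, and threshold are all hypothetical choices.

```python
def spread_activation(total, decay, max_distance):
    """Distribute a fixed activation budget over nodes at distances 0..max_distance.

    Each step away from the target receives `decay` times the activation of the
    previous step; the weights are rescaled so that they sum to `total`.
    """
    weights = [decay ** d for d in range(max_distance + 1)]
    scale = total / sum(weights)
    return [w * scale for w in weights]


def nodes_above(activations, threshold):
    """Count the nodes whose activation exceeds a retrieval threshold."""
    return sum(1 for a in activations if a > threshold)


# Hypothetical parameters: same total activation, different rates of fall-off.
control = spread_activation(total=1.0, decay=0.4, max_distance=6)
patient = spread_activation(total=1.0, decay=0.8, max_distance=6)

print("peak activation (control, patient):", max(control), max(patient))
print("nodes above 0.05 (control, patient):",
      nodes_above(control, 0.05), nodes_above(patient, 0.05))
```

Under these assumptions, the shallower fall-off leaves total activation unchanged but pushes more nodes over a given threshold while lowering the peak at the target itself, which is the pattern the increased-distance hypothesis describes.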

Aside from this evidence of a relatively 
minor contribution to the expression of for- 
mal thought disorder, the first deficit shown 
by Kerns and Berenbaum (2002) to con- 
tribute significantly to thought disorder in- 
volves semantic memory functioning rel- 
atively distinct from automatic spreading 
of activation, such as impairment of con- 
trolled retrieval of information from seman- 
tic memory, which may itself have an ab- 
normal network structure (because of the 
cumulative effects of a chronic inability to 
encode semantic information, for instance). 
Relevant studies (e.g., Allen et al., 1993; 
Goldberg et al., 1998; Kerns et al., 1999) 
tend to employ fluency tasks requiring re- 
trieval of information from semantic mem- 
ory by means such as a controlled imple- 
mentation of retrieval strategy (Ruff et al., 
1997). In agreement with conclusions of- 
fered by Minzenberg, Ober, & Vinogradov 
(2002) and by Baving and colleagues (2001), 
all of whom argue that semantic retrieval is 
most consistently and robustly impaired in 
schizophrenia patients when a high degree 
of controlled processing is required, Kerns 
and Berenbaum (2002) present evidence of 
a strong, consistent association between this 
type of semantic processing abnormality and 
presence of formal thought disorder. They 
argue additionally that the current literature 
does not offer evidence permitting a disam- 
biguation between abnormal network struc- 
ture and impaired information retrieval. 

Evidence of impaired semantic retrieval 
associated with formal thought disorder is 
consistent with the focus of Hemsley’s and Gray’s 
model (Gray et al., 1991; Gray, 1998) 
on the smooth integration of stored in- 
formation with incoming information and 


tion from (long-term) semantic memory is 
continuously retrieved and integrated into 
comprehension and online production of 
verbal behavior. Failure of this fluid integra- 
tion therefore has the ability to ultimately 
prevent a match signal from being gener- 
ated, engaging (albeit indirectly) capacity- 
limited, controlled processing resources, and 
likely recruiting activation of left inferior 
prefrontal cortex to facilitate the otherwise 
automatic selection of semantic information 
mediated by activity in the left middle tem- 
poral gyrus (Gold & Buckner, 2002; Indefrey 
& Levelt, 1999). 

This process of semantic memory re- 
trieval and integration itself may be mod- 
ulated by the subject of Kerns’ and Beren- 
baum’s (2002) fourth category of cognitive 
deficit contributing to formal thought dis- 
order — namely, impaired executive func- 
tioning. Kerns and Berenbaum (2002) 
demonstrate that executive function abnormality, 
treated as a composite construct, is strongly related 
to the presence of formal thought disorder. 
Of course, executive function itself entails 
a number of critical subsystems (Baddeley, 
1986), including a mechanism for process- 
ing contextual information (and effectively 
inhibiting irrelevant, noncontextual infor- 
mation), a mechanism for allocation of at- 
tentional capacity serving to maintain in- 
formation over a delay, and a mechanism 
for monitoring one’s own behavior, in- 
cluding speech. 


CONTEXT/SELECTIVE ATTENTION 


Consistent with Cohen’s and Braver’s 
model, there is indeed considerable evi- 
dence that thought-disordered patients suf- 
fer from abnormal processing of contex- 
tual information. In fact, Levelt’s model of 
speech production incorporates contextual 
information at numerous stages, such as dur- 
ing conceptual preparation (when interper- 
sonal context is considered, for instance). 
Additionally, the process of lexical selec- 
tion may be influenced by discourse context 
(Horn & Ward, 2001), which describes the 


representation of previously uttered verbal 
sure that subsequent utterances will show 
adequate structural continuity with and se- 
mantic and conceptual relevance to the over- 
arching conversation. Numerous investiga- 
tors have examined schizophrenia patients’ 
capacity to use discourse context to guide 
selection of verbal behaviors. Studies us- 
ing the traditional cloze procedure (Taylor, 
1953), in which the subject reads a block 
of text missing every fourth or fifth word 
and must attempt to use the context pre- 
ceding each blank to guess what word is re- 
quired, have found that psychotic patients 
tend to show impaired performance (re- 
viewed in Cozolino, 1983); however, several 
marked methodological limitations of the 
procedure (Maher, 1991) cast uncertainty 
on interpretation of those findings. A great 
number of studies have taken a different 
approach (Benjamin & Watt, 1969; Chap- 
man & Chapman, 1973; Cohen & Servan- 
Schreiber, 1992; Kuperberg, McGuire, & 
David, 1998; Sitnikova et al., 2002) using 
various lexical disambiguation tasks that re- 
quire the subject to use contextual informa- 
tion from preceding clauses to determine the 
relevant meaning of a homograph, or a word 
with multiple possible definitions. 

These and other investigators have gener- 
ally concluded that patients with psychotic 
disorders fail to demonstrate sensitivity to 
the biasing influence of preceding contextual 
information; however, Chapman and Chap- 
man (1973) refined this conclusion, arguing 
that patients fail to demonstrate sensitivity 
to discourse context only when it suggests a 
homograph’s nondominant meaning. They 
characterized this deficit as “excessive yield- 
ing to normal biases,” or a tendency to utilize 
dominant meanings. For instance, when one 
patient was asked to interpret the proverb 
“One swallow does not make a summer,” he 
responded, “When you swallow something, 
it could be all right, but the next minute you 
could be coughing, and dreariness and all 
kind of miserable things coming out of your 
throat” (Harrow & Quinlan, 1985, p. 436). 
The patient clearly demonstrated a bias to- 
ward the more dominant meaning of the 
word “swallow,” despite the fact that the 


dominant meaning. 

Of course, excessive yielding to normal 
biases is the logical complement of Cohen’s 
and Braver’s cognitive control mechanism, 
which is defined by its ability to overcome 
these normal biases. Accordingly, Cohen, 
Braver, and colleagues argue that an indi- 
vidual’s representation of discourse context, 
as well as his or her goals for the inter- 
action (e.g., make a particular point, com- 
municate in a certain manner) constitute 
contextual information, guiding the ongo- 
ing implementation of related semantic con- 
cepts (Botvinick et al., 2001). Failure to 
encode, update, or maintain this contextual 
information therefore leads to a failure to 
utilize discourse context to constrain and se- 
lect subsequent verbal output, appearing to 
the observer as a relative lack of association 
between units of language output. 

Moreover, if failure to encode, main- 
tain, or implement contextual information 
is, in fact, a mechanism underlying formal 
thought disorder, it may explain a long-held 
piece of clinical wisdom — specifically, dis- 
ordered speech is more likely to be elicited 
by abstract, ambiguous, open-ended stim- 
uli (such as the general question posed to 
the quoted subjects at the beginning of this 
chapter, or even Rorschach inkblots; John- 
ston & Holzman, 1979) than by specific, 
closed-ended prompts. In other words, the 
fewer structural demands and intermediate 
goal states provided explicitly, the more dif- 
ficult it is to practice cognitive control. Un- 
der these circumstances, not only is spe- 
cific contextual information either never en- 
coded or lost from active maintenance, but 
the context processing module loses the con- 
comitant ability to inhibit the activation of 
competing pieces of information, exposing 
the system to increased memory retrieval 
interference (Anderson & Spellman, 1995) 
and subsequent loss of goal orientation in 
produced speech. 


CAPACITY ALLOCATION 


Given this continued focus on controlled 
processing as critical to information process- 
ing abnormalities related to formal thought 
cation of working memory capacity, a pro- 
cess shown to involve activation of dorso- 
lateral prefrontal cortex as well as more 
modality-specific regions of posterior cortex 
(e.g., Garavan et al., 2000), as well as avail- 
ability of free capacity, which appears to 
be reflected in the activity of dorso- (Cal- 
licott et al., 1999) and ventrolateral pre- 
frontal cortex (Rypma, Berger, & D’Esposito, 
2002). Numerous studies (e.g., Docherty & 
Gordinier, 1999; Harvey & Pedley, 1989; 
Nuechterlein et al., 1986) have found corre- 
lational evidence of a relationship between 
working-memory capacity and aspects of 
formal thought disorder. Attempting to 
clarify the direction of this relationship, 
Barch and Berenbaum (1994) report that, 
among nonill subjects, reduction in overall 
processing capacity (achieved through a 
dual-task manipulation) is associated with 
decreases in verbosity and syntactic com- 
plexity, which are verbal phenomena in- 
cluded in formal thought disorder — particu- 
larly “negative thought disorder” (Andreasen 
& Grove, 1986). Melinder and Barch (2003) 
extend this approach to include psychotic 
patients, showing that they, too, manifest 
increased negative thought disorder with 
decreasing availability of working-memory 
capacity. These results are particularly note- 
worthy because the investigators were able 
to demonstrate that reduced processing ca- 
pacity can actually cause speech to become 
disordered, rather than merely showing a correla- 
tion between reduced processing capacity 
and thought disorder. Indeed, this repre- 
sents one instance out of many in which 
schizophrenia research has shown working- 
memory capacity to act as a bottleneck, 
limiting the production or implementation 
of abstract ideas (e.g., Glahn et al., 2000; 
Silver et al., 2003). 


SELF-MONITORING 


Additionally, inspired by Levelt’s (1989) as- 
sertion that the production of nondisordered 
speech requires the speaker to monitor his 
or her own speech, and consistent with ev- 
idence that schizophrenia patients show an 


neous behaviors (e.g., Malenka, et al., 1982) 
and that patients with formal thought dis- 
order demonstrate significant impairment 
in self-monitoring of motor behavior (e.g., 
Kircher & Leube, 2003), Barch and Beren- 
baum (1996) administered a task requiring 
patients to read separate word lists and then, 
later, to recall whether presented words were 
read aloud or silently or were novel to the 
testing phase of the study. Patients who 
demonstrated worse performance on this 
task tended to produce a greater number of 
verbal derailments (i.e., switching tangen- 
tially between topics of discussion) in the 
independent speech sample, suggesting that, 
whereas amount and content of disordered 
speech are strongly affected by working- 
memory capacity available, the coherence 
and goal directedness of speech are influ- 
enced to a great degree by contextual pro- 
cessing and self-monitoring ability. 


INTEGRATION OF COGNITIVE DEFICITS 
CONTRIBUTING TO FORMAL THOUGHT DISORDER 


Hemsley and Gray also argue for impair- 
ment in self-monitoring among schizophre- 
nia patients, proposing that disruptions in 
this capability result in a failure to gen- 
erate match signals and a consequential 
increase in the extent of controlled, ef- 
fortful processing engaged. Given that this 
repeated failure results in a reduction in 
availability of online processing resources 
concomitant with the shift from automatic 
to more controlled functioning, it should 
lead to a change in the manner in which in- 
formation is retrieved from semantic mem- 
ory (Badre & Wagner, 2002). Specifically, 
the retrieval of target information should be 
biased by the activation as well as by the 
inhibition of nontarget information (Neely, 
1977) consistent with the notion of cog- 
nitive control. 

An investigation by Titone, Levy, and 
Holzman (2000) provides further empiri- 
cal support for the presence and opera- 
tion of these pathological semantic memory 
retrieval and executive functions contribut- 
ing to formal thought disorder. The au- 
thors reported the results of a cross-modal 
meanings of otherwise semantically ambigu- 
ous words were biased either moderately or 
strongly by the context of a preceding sen- 
tence. Experimental parameters were opti- 
mized to increase the likelihood that more 
controlled retrieval of semantic information 
would be utilized. Schizophrenia patients 
showed a pattern of priming identical to con- 
trols in the strong contextual bias condition 
but exhibited a greater degree of priming in 
the moderate contextual bias condition (i.e., 
patients showed priming effects for both rel- 
atively dominant and relatively subordinate 
meanings, whereas controls showed priming 
facilitation only for subordinate meanings). 
The authors point out that retrieval of a par- 
ticular meaning of a word requires not only 
activation of the word within a semantic net- 
work but also inhibition of nearby, less rel- 
evant meanings. Patients were able to per- 
form this selection process normally when 
strong contextual bias was present, but when 
this influence was more subtle, the patients’ 
degraded, retrieval-related inhibitory mech- 
anism failed to filter out alternate meanings, 
creating interference with the most imme- 
diately relevant meaning. 

Therefore, to the extent that the study 
indeed engaged controlled processing mech- 
anisms (and consequently did not rely en- 
tirely on the automatic spread of activation 
in a semantic network), the results support 
the hypothesis that disordered speech results 
from disrupted executive-assisted seman- 
tic memory retrieval mechanisms involv- 
ing both abnormal activation-based retrieval 
of information from semantic memory and 
impaired executive function involving re- 
duced inhibition of irrelevant, noncontex- 
tual information. Additionally, recognizing 
this possibility, Kerns and Berenbaum (2002) 
call for more direct testing of hypotheses in- 
volving a primarily inhibitory deficit funda- 
mental to formal thought disorder. 


Ex cogito, Dementia 


As the foregoing discussion illustrates, cur- 
rent cognitive models of thought disorder 
have many merits, not the least of which is 


data in a variety of experimental cogni- 
tive tasks. In addition, these models con- 
verge with descriptive analyses of the expe- 
rience of thought disorder in patients with 
psychotic disorders. Yet the parsimony that 
these models gain in attributing context- 
processing deficits in thought-disordered 
patients to a disturbance in a particular pro- 
cessing component (i.e., either a disruption 
in short-term representations of stimulus 
context or in the integration of current con- 
textual information with memories of prior 
stimulus contexts) also leaves them vulnera- 
ble to refutation inasmuch as disturbances in 
other (or multiple) processing components 
of the complex, integrated circuitry medi- 
ating willed behavior could account equally 
well for a wide variety of thought-disordered 
phenomena. That is to say, demonstrating 
that a particular neurocognitive impairment 
could account for a particular behavioral ab- 
normality does not necessarily demonstrate 
that the impairment does cause the abnormal 
behavior to occur. 

Indeed, as mentioned in the Introduc- 
tion, more than four decades of intensive 
neuroscientific investigations have failed to 
identify conclusively a single defining lesion 
in patients with schizophrenia or other 
forms of psychosis. Rather, as discussed in 
more detail subsequently, these syndromes 
manifest with deficits to many neural sys- 
tems (e.g., cortico-cortical, fronto-striatal, 
temporo-limbic) across several levels of 
analysis (e.g., alterations in gray matter 
volume, dendritic arborization in cortical 
neurons, and neurotransmitter receptor dis- 
tributions). In light of this complexity, we 
attempt to apply a relatively new analytical 
framework that has become the dominant 
paradigm in psychopathology research — 
that is, the endophenotype approach 
(Gottesman & Gould, 2003) — to theoretical 
accounts of thought disorder. The basic 
premise of the endophenotype approach 
is that a given clinical syndrome such as 
schizophrenia is composed of multiple neu- 
rocognitive trait deficits, each of which may 
be determined by at least partially indepen- 
dent mechanisms. A major consequence of 
be necessary but not sufficient for the pheno- 
typic manifestation of a syndrome; thus, the 
trait deficit will be shared by individuals with 
a vulnerability to the syndrome regardless of 
whether they manifest the syndrome phe- 
notypically. Other deficits may be specific 
to individuals who manifest the syndrome 
phenotypically; these latter deficits may 
thus potentiate the expression of a symptom 
in those who carry vulnerability (i.e., those 
who have deficits in other neurocognitive 
domains that are necessary but not sufficient 
for overt disease expression). To develop 
this framework further in the context of a 
discussion of thought disorder, it will first 
be useful to explicate a number of facts 
about the genetic epidemiology and clinical 
neuroscience of schizophrenia. 


The Genetic Epidemiology 
of Schizophrenia 


Although we are aware of only one study re- 
porting on the heritability of formal thought 
disorder itself (Gambini et al., 1997), a great 
deal of evidence is available demonstrating 
that genetic factors contribute substantially 
to the development of schizophrenia, ac- 
counting for about 80% of the risk of de- 
veloping the disorder. The transmission pat- 
tern, however, is complex, involving at least 
several different genes as well as environ- 
mental factors (Cannon et al., 1998; Tsuang, 
Stone, & Faraone, 1999; Tsuang & Faraone, 
1999). One consequence of the complex- 
ity of the inheritance pattern in schizophre- 
nia is that an individual may carry some de- 
gree of genetic predisposition to the illness 
without expressing it phenotypically — or at 
least without expressing it to a degree severe 
enough to meet diagnostic criteria. Stated 
differently, only a subset of genetically vul- 
nerable individuals actually develops a psy- 
chotic disorder. For many with such a genetic 
predisposition, an environmental contribu- 
tion (to which genetically predisposed indi- 
viduals might be differentially sensitive) to 
development of a psychotic disorder is also 
required. Among the environmental factors 
that may be involved, prenatal and perinatal 


with fetal hypoxia or oxygen deprivation, 
are robustly associated with an increased 
risk for schizophrenia. Complications asso- 
ciated with fetal hypoxia are also of inter- 
est because fetal oxygen deprivation repre- 
sents a plausible mechanism for explaining 
much of the structural pathology of the brain 
detected in neuroimaging studies of adult 
schizophrenia patients (Cannon, 1997). 
Applying the conclusion that such ge- 
netic and environmental influences com- 
bine (additively or interactively) to deter- 
mine an individual’s risk for expressing a 
psychotic disorder to the study of neu- 
rocognitive traits helps demonstrate which 
such traits are likely necessary, but not suf- 
ficient, for the expression of a psychosis 
phenotype (or to the expression of any 
phenotype, including specific symptoms, for 
example). Specifically, deficits related en- 
tirely to the genetic diathesis for develop- 
ing the given phenotype may be necessary 
but clearly are not sufficient for the mani- 
festation of that phenotype. This endophe- 
notype should be present in any individual 
carrying the genetic vulnerability. Conse- 
quently, if one member of a set of monozy- 
gotic twins (who, by definition, have identi- 
cal genomes) displays a vulnerability-specific 
trait, the other must as well. Additionally, 
any trait not shared by both monozygotic 
twins must result to some degree from the 
influence of unique environmental events. 


Neural System Abnormalities 
in Schizophrenia 


Although neither the specific neurobiolog- 
ical processes associated with the expres- 
sion of formal thought disorder nor those 
associated with psychosis in general have 
been definitively isolated, disturbances in 
prefrontal and temporo-limbic systems and 
their interconnections are likely to play criti- 
cal roles in both (Cohen & Servan-Schreiber, 
1992; Grace & Moore, 1998; Gray et al., 
1991). The prefrontal cortex is thought 
to support higher-order cognitive processes 
such as working memory, the strategic allo- 
cation of attention, reasoning, planning, and 
Rakic, 1995; Kane & Engle, 2002; Miller & 
Cohen, 2001). The medial temporal lobe 
structures (i.e., hippocampus, amygdala) 
and adjacent temporal cortex are involved in 
learning and recall of episodic information, 
emotion (especially the amygdala), and cer- 
tain aspects of language processing (Squire 
& Zola, 1996). 

Neuropsychological studies have shown 
that, against a background of generalized 
information processing impairment, schi- 
zophrenia patients manifest profound 
deficits in the areas of long-term and work- 
ing memory (Cannon et al., 2000; Saykin 
et al., 1994). These deficits appear not to 
be merely secondary effects of impaired 
attention, disease chronicity, or medica- 
tion exposure (Cirillo & Seidman, 2003). 
Such findings have been corroborated by 
evidence of abnormal physiologic activity 
(i.e., altered blood flow) in prefrontal and 
temporal lobe regions in patients with 
schizophrenia during performance of tests 
assessing these same domains of functioning 
(Berman et al., 1992; Callicott et al., 1998; 
Heckers et al., 1998; Yurgelun-Todd et al., 
1996). At the structural anatomical level, 
schizophrenia patients show a variety of 
volumetric changes throughout the brain, 
including reduced cortical, hippocampal, 
and thalamic volumes (Pfefferbaum & 
Marsh, 1995). Recent neuroimaging work 
indicates a relatively greater degree of 
reduction in frontal and temporal cortical 
volumes compared with posterior cortical 
volumes (Cannon et al., 1998). 


Prefrontal Cortex and Working-Memory 
Deficits 


Several lines of evidence suggest that 
working-memory deficits and associated ab- 
normalities in prefrontal cortical structure 
and function are reflective of an inherited 
diathesis to schizophrenia. In a Finnish twin 
sample, we found that impaired perfor- 
mance on tests of spatial working-memory 
capacity and structural abnormalities in po- 
lar and dorsolateral prefrontal regions var- 
ied in a dose-dependent fashion with degree 


non et al., 2000; Cannon et al., 2002; Glahn 
et al., 2002). Interestingly, global and dor- 
solateral prefrontal volumetric deficits have 
been found to correlate with performance 
deficits on tests sensitive to diverse working- 
memory processes (Maher et al., 1995; 
Seidman et al., 1994). The nature of the 
pathological mechanism underlying these 
correlations is not necessarily obvious, how- 
ever. Rather than a loss of neurons or in- 
terneurons, it has been suggested that gross 
gray matter volume decrements reflect a 
reduction of interneuronal neuropil — the 
space between neural cells consisting largely 
of neurons, dendrites, and axons — in the pre- 
frontal region in patients with schizophre- 
nia and result in impaired working-memory 
functioning through hypoactive dopamin- 
ergic modulation of pyramidal cell activity 
(Goldman-Rakic & Selemon, 1997). Rather 
than subcortical dopaminergic dysregula- 
tion, in this case, dopamine would be acting 
within the cortex (although affecting a dis- 
tinct set of receptors). This prediction has 
been supported by a positron emission to- 
mography investigation that found signifi- 
cantly decreased dopamine receptor binding 
in the prefrontal cortex of schizophrenia 
patients (Okubo et al., 1997). Notably, 
dopamine receptor reduction predicted cer- 
tain types of symptoms, as well as working- 
memory impairment (but also see Abi- 
Dargham et al., 2002). It is also of interest 
in this context that treatment with med- 
ication modulating cortical dopamine lev- 
els is associated with normalization of 
blood flow in the prefrontal cortex and in- 
creased behavioral accuracy during perfor- 
mance of a working-memory test (Honey & 
Andrew, 1999). 

Given that abnormalities of working 
memory and prefrontal structure and func- 
tion are associated with genetic liability to 
schizophrenia, it should be possible to iden- 
tify specific genes that underlie these dis- 
turbances, especially in light of accumulat- 
ing evidence of physiological abnormality. 
Weinberger and colleagues have reported ev- 
idence of one such genetic influence — the 
MET/VAL polymorphism of the COMT 
gene (located on chromosome 22), with 
VAL alleles promoting more rapid break- 
down of synaptic dopamine, leading to 
prefrontal hypofunction in patients with 
schizophrenia (Egan et al., 2001). We have 
been interested in another potential sus- 
ceptibility locus that may affect prefrontal 
function in schizophrenia — this one on 
chromosome 1. 

Inspired by independent reports of a lo- 
cus of susceptibility within a specific region 
on chromosome 1 (Ekelund et al., 2000; Mil- 
lar et al., 2000; St. Clair et al., 1990), we 
performed linkage and association analyses 
across the chromosome 1 region of interest 
using quantitative neuropsychological mea- 
sures of liability in our sample of twins dis- 
cordant for schizophrenia (Gasperoni et al., 
2003). Analyses revealed that the Visual 
Span subtest of the Wechsler Memory Scale, 
an indicator of spatial working-memory 
function, was significantly and uniquely sen- 
sitive to allelic variation of a gene within a 
highly specific portion of the chromosome — 
very likely to be the DISC1 gene. The DNA 
sequence of the DISC1 gene is most ho- 
mologous to proteins involved in axon guid- 
ance, synaptogenesis, and intracellular ax- 
onal and dendritic transport. Recently, the 
protein was shown to promote neurite out- 
growth (Ozeki et al., 2003). This function 
may help explain the reductions in neuropil 
volume observed in postmortem studies of 
schizophrenia patients. 

Together, these findings strongly impli- 
cate genetic factors as playing a role in 
the abnormalities of prefrontal cortex and 
working memory in schizophrenia. Because 
deficits on tests sensitive to working mem- 
ory have also been observed in children at 
elevated genetic risk (Cosway et al., 2000), 
it is tempting to conclude that disturbances 
in the prefrontal cortex in schizophrenia 
are reflective of an inherited vulnerability 
to the disorder that is present from early in 
life. Nevertheless, patients with schizophre- 
nia have been found to show even greater 
disturbances in dorsolateral prefrontal cor- 
tex function and structure than their nonill 
monozygotic twins (Cannon et al., 2002). 
Thus, although genetic factors may cause 
patients and some of their first-degree rela- 
tives to share a certain degree of compromise 
in prefrontal cortical systems, nongenetic, 
disease-specific influences cause the dorso- 
lateral prefrontal cortex to be further deviant 
in the patients. 


TEMPORAL LOBE AND EPISODIC MEMORY DEFICITS 


Several microscopic abnormalities of the 
hippocampus have been documented in 
schizophrenia, including alterations in neu- 
ronal density (Falkai & Bogerts, 1986; Hut- 
tenlocher, 1979; Jeste & Lohr, 1989; Zaidel, 
Esiri, & Harrison, 1997), size (Arnold, 2000; 
Benes, Sorensen, & Bird, 1991), and orienta- 
tion (Conrad et al., 1991; Conrad & Scheibel, 
1987; Kovelman & Scheibel, 1984). These 
hippocampal volume decrements appear to 
be present at disease onset (Bilder et al., 
1995; Velakoulis et al., 1999) and also 
appear to be present to some degree in 
healthy biological relatives of schizophrenia 
patients, suggesting hippocampal volume is 
related to the genetic diathesis for develop- 
ing schizophrenia (Lawrie et al., 1999; Sei- 
dman et al., 1999; Seidman et al., 2002). 
Postmortem and magnetic resonance imag- 
ing (MRI) studies of schizophrenia patients, 
however, have reported positive correlations 
between hippocampal volume and age at 
onset (Bogerts et al., 1990; Dauphinais et 
al., 1990; Stefanis et al., 1999; Van Erp 
et al., 2002), suggesting a relationship be- 
tween hippocampal volume and the dis- 
ease process, which complicates any simple 
interpretation. 

From a neurocognitive perspective, im- 
paired declarative memory processes that 
depend on the integrity of the hippocam- 
pus (Faraone et al., 2000) have been re- 
ported in both high-risk adolescents (Byrne 
et al., 1999) and nonpsychotic relatives 
of schizophrenia patients (Cannon et al., 
1994), suggesting they derive, in part, from 
an inherited genotype. However, because 
long-term memory deficits are specifically 
more pronounced in patients compared with 
their own healthy monozygotic twins, non- 
genetic, disease-specific factors must also be 
involved (Cannon et al., 2000). Importantly, 
tionship between deficits in verbal declara- 
tive memory and smaller hippocampal vol- 
umes in relatives of schizophrenia patients 
(O'Driscoll et al., 2001; Seidman et al., 
2002). Furthermore, initial evidence indi- 
cates that impairment in long-term verbal 
memory and, to a lesser extent, executive 
function is associated with the occurrence 
of psychotic symptoms in subjects thought 
to be at significantly elevated risk for even- 
tually developing a diagnosable psychotic 
disorder such as schizophrenia, suggesting 
that these deficits may mark the pathophys- 
iological processes underlying functional 
deterioration during the earliest phase of dis- 
ease onset (Cosway et al., 2000). 

Given the putative importance of the 
hippocampus to verbal and executive func- 
tion and, therefore, its possible role in pro- 
ducing disordered speech, it is of interest 
to revisit the issue of genetic versus envi- 
ronmental contributions to hippocampal in- 
tegrity. Compared with other parts of the 
brain, the hippocampus is acutely vulnera- 
ble to hypoxic-ischemic damage (Vargha- 
Khadem et al., 1997; Zola & Squire, 2001) — 
that is, insult temporarily depriving neural 
cells of oxygen. In monozygotic twins discor- 
dant for schizophrenia, relatively reduced 
hippocampal volume in the ill twin was 
significantly related to the presence of la- 
bor or delivery complications and to pro- 
longed labor, which are both risk factors 
associated with fetal oxygen deprivation 
(McNeil et al., 2000). We have previously 
found, in a Helsinki birth cohort, that 
schizophrenia patients who experienced 
fetal hypoxia have smaller hippocampal 
volumes than those who did not, a
difference not noted within unaffected sib- 
lings and healthy comparison subjects (Van 
Erp et al., 2002). At the same time, hip- 
pocampal volume differences occurred in 
a stepwise fashion with increase in genetic 
vulnerability for developing schizophrenia 
(consistent with the findings of Seidman 
et al., 2002), suggesting that, in patients 
with schizophrenia spectrum disorders, hip- 
pocampal volume is influenced in part by 
schizophrenia susceptibility genes and an in- 


fetal hypoxia. Together, these findings indi- 
cate that, whereas hippocampal volume in 
healthy subjects is under substantial genetic 
control, hippocampal volume in schizophre- 
nia patients and their relatives appears to be 
influenced to a greater extent by unique and 
shared environmental factors (Van Erp et al., 
in press). 


Integrating Cognitive Models 
and Endophenotypes 


It appears possible to unify components of 
the two cognitive models of disrupted infor- 
mation processing in schizophrenia patients 
and the findings related specifically to formal 
thought disorder reviewed in the first part of 
this chapter with the research on neurocog- 
nitive endophenotypes in schizophrenia just 
summarized. At the cognitive level of anal- 
ysis, two mechanisms appear to be neces- 
sary for the expression of formal thought 
disorder: an executive, online processing 
system responsible for encoding, maintain- 
ing, and updating of goal-related informa- 
tion (context information in Cohen’s and 
Braver’s model) and an integrated system in- 
volving the retrieval of information from se- 
mantic memory and its fluid integration into 
verbal behavior (i.e., the key component of 
Hemsley’s and Gray’s model). 

In terms of the endophenotype frame- 
work described previously, individuals at el- 
evated genetic risk but not expressing the 
schizophrenia phenotype show mildly im- 
paired functioning of executive systems and 
related working memory and attention com- 
ponents. These executive processing deficits 
therefore appear to be associated with the 
diathesis necessary, but not sufficient, for the 
development of thought disorder. Beyond 
this diathesis, the abnormal interaction of 
executive and semantic memory systems — 
likely in service of controlled retrieval of 
information and its integration into ongo- 
ing speech — is associated with a psychosis- 
specific factor itself related to both genetic 
vulnerability and exposure to environmen- 
tal risk factors. Individuals with schizophre- 
nia and their unaffected twins show a 
structural and functional abnormality,
somewhat greater in severity in the patients. 
Patients and their relatives additionally show 
temporal lobe abnormalities; however, the 
degree of difference in temporal lobe abnormality
between schizophrenia patients and
genetically vulnerable individuals is signifi- 
cantly larger than the corresponding differ- 
ence in prefrontal abnormality. 

Taken together, these results suggest that 
mild impairment in prefrontal cortex and as- 
sociated impairment of the functioning of 
online cognitive processing systems (i.e., ex- 
ecutive functions, including working mem- 
ory and selective attention) constitute a 
necessary but not sufficient (i.e., contribut- 
ing) cause of thought disorder, which itself 
derives from a genetic diathesis to develop- 
ing a psychotic disorder such as schizophre- 
nia. An additional factor related etiologi- 
cally to exposure to an environmental insult 
interacting with genetic predisposition and 
also necessary but not sufficient for the ex- 
pression of schizophrenia involves disrupted 
interaction between an executive, online 
processing system and a semantic memory 
storage and selection system loosely map- 
ping onto schizophrenia patients’ prefrontal 
and temporal lobe abnormalities, respec- 
tively. That is, abnormalities in both the 
prefrontal or executive-related circuitry and 
in the temporal lobe circuitry (i.e., medial
temporal lobe for episodic memory
and nearby middle temporal gyrus for se- 
mantic memory; Kircher et al., 2001) may 
be required to account for the full range 
of thought disorder observed in patients 
with schizophrenia, whereas only the former 
may be required to account for the subtler 
thought disturbances seen in genetically vul- 
nerable individuals who do not manifest the 
full schizophrenia syndrome phenotypically 
(perhaps the case with the interviewee in the 
second quote at the beginning of this chap- 
ter). Of course, it is also possible that severity 
of phenotypic thought disorder scales with 
severity of compromise of both components 
of the system rather than with their conjunction
per se. Further work is needed to distinguish
between these two possibilities.


nitive, genetic, and neural pathologies of 
thought disorder in general, and schizophre- 
nia specifically, has necessarily taken on a 
complex, interactive structure. As we have 
seen, cognitive models designed to predict 
particular behavioral outcomes can, in fact, 
help researchers to understand the func- 
tional correlates of anatomical abnormalities 
measured between genetically defined risk 
groups. Similar permutations involving these 
and numerous other levels of analysis equip 
us with heuristics that guide our struggle to 
unravel the complexities of neuropsychiatric 
phenomena such as formal thought disorder. 
We have attempted to present such a heuris- 
tic framework based on links we have ob- 
served between bodies of research into the 
pathology of thought disorder; some of these 
links cross between levels of analysis, ideally 
helping us to map genetic, neurological, and 
cognitive systems onto each other. 


Future Directions 


Along the way to accomplishing this inte- 
grative goal, a great deal more work needs 
to be done. Ideally, the parsing of formal 
thought disorder into necessary and suffi- 
cient functional components — such as the 
work being carried out on the level of cogni- 
tive specification by Barch, Berenbaum, and 
colleagues — will be complemented by fur- 
ther study of the physiological and genetic 
variations associated with the production of 
abnormal speech. 

This line of work will likely be facili- 
tated by cognitive neuroscience’s growing 
ability to study the activity of particular 
brain mechanisms during the production of 
speech, overcoming previously prohibitive 
practical obstacles caused by movement arti- 
facts detrimental to work utilizing functional 
MRI (Barch et al., 1999) and EEG (e.g.,
Ford et al., 2002). Prior to these method- 
ological advances, only speech production 
studies employing covert vocalization were 
practical; however, these investigations typ- 
ically fall short of describing compellingly 
a phenomenon measured entirely in terms
of overt speech production. 

Additional progress in the study of 
thought disorder involves application of 
paradigms from the emerging field of so- 
cial cognitive neuroscience (e.g., Adolphs, 
2003; Wood, 2003) to the study of interpersonal
deficits in schizophrenia (e.g., Penn
et al., 2002; Pinkham et al., 2003), includ- 
ing the distinctly interpersonal task of verbal 
communication (Grossman & Harrow, 1996; 
Racenstein et al., 1999). For instance, the 
study of communication deviance (including 
aspects of formal thought disorder) within 
the families of patients with psychotic dis- 
order diagnoses or patients thought to be at 
high risk for developing a psychotic disorder
has been an area of active research for
some time (e.g., Docherty, 1995; Sass et al., 
1984; Wahlberg et al., 2000). Applying this 
established framework to the examination 
of neuronal correlates of receptive and pro- 
ductive aspects of intrafamily communica- 
tion — potentially distinct from communi- 
cation with nonfamily individuals because 
of the role of factors such as increased in- 
terpersonal familiarity and less predictable 
affective modulation of cognitive processes 
involved in communication — offers a novel 
perspective with the potential to reinvigo- 
rate this important line of thought disorder 
research. 

Another area of thought disorder research 
deserving continued attention involves the 
study of formal thought disorder in popu- 
lations other than those currently meeting 
diagnostic criteria for a major mental ill- 
ness. Although modern antipsychotic med- 
ications appear to be relatively effective at 
helping psychotic patients organize their 
speech (e.g., Wirshing et al., 1999), signif- 
icant levels of thought disorder often appear 
noticeable in groups of patients who would 
not typically be treated with therapeutic 
doses of such medications (Andreasen & 
Grove, 1986). For instance, in a sample of 
patients judged to be at significantly elevated 
risk for developing a psychotic disorder (in 
part because they were displaying some psy- 
chotic symptoms but at a level of intensity 


we found a significant level of formal 
thought disorder — interestingly, coupled 
with significant impairment of selective at- 
tention — which remitted with relatively 
low-dose antipsychotic pharmacotherapy 
(Cannon et al., 2002). These and similar 
findings raise interesting questions regarding 
the potential utility of formal thought dis- 
order as a prodromal indicator of psychosis 
as well as the potential benefits of symptom- 
based treatment outside the context of a ma- 
jor psychiatric diagnosis. 


Notes


1. Delusions and similar disorders of thought con- 
tent are not the central focus of this chapter 
but might be of interest to cognitive psychol- 
ogists. For instance, the study of development 
and maintenance of delusions is an area of ac- 
tive research. See Bermudez (2001), Garety 
and Freeman (1999), Gold and Hohwy (2000), 
and/or Maher (2002) for debate over whether 
or not delusions represent products of flawed 
inferential reasoning. 

2. Bleuler is likely also the source of the distinc- 
tion between thought form and content dis- 
cussed earlier because he drew the distinction 
(Bleuler, 1911/1950) between what he labeled
“fundamental symptoms,” including, but not 
limited to, the loosening of ideational associa- 
tions, and “accessory symptoms,” including hal- 
lucinations and delusions. 

3. Working in parallel with Cohen, Braver, and 
colleagues, Kane and Engle (2002) and their 
formation maintenance and behavioral response
selection entirely compatible with Cohen,
Braver, and colleagues’ model. Rather
than referring to a context-processing mod- 
ule, however, Kane and Engle (2002) deem the 
same cognitive mechanism “controlled atten- 
tion” and survey implications of applying the 
model to patients with frontal lobe lesions, al- 
though this mechanism is certainly relevant to 
psychosis. 

4. In fact, Bleuler (1911/1950, as discussed in
Chapman & Chapman, 1973), writing nearly 
one century ago, argued that formal thought 
disorder in schizophrenia patients involves a 
failure to utilize context information to bind 
ideational elements together in logical sequence.
However, he blamed this disorder on
the breaking of the “associative thread” linking
a given goal to the appropriate contextual
influence, rather than considering the goal context
itself.

5. See Andreasen et al. (1999) for the descrip- 
tion of an alternate, circuit-based comparator 
mechanism. 


CHAPTER 22


Development of Thinking 


Graeme S. Halford 


It is appropriate to begin a review of research
on cognitive development with the work of 
pioneering researchers such as Luria, Piaget, 
and Vygotsky, who provided much of the 
conceptual foundation on which later con- 
tributions were built. We will begin with a 
survey of this legacy, then proceed to more 
contemporary theories, and finally consider 
a number of key empirical research topics. 


Early Influences 


The single most powerful influence on past 
research into the development of thinking 
has been the work of Piaget and his col- 
laborators (Inhelder & Piaget, 1958, 1964; 
Piaget, 1950, 1952, 1953, 1957, 1970), but 
the influence of Vygotsky (1962) appears to 
be increasing with time. The work of Luria 
(1976) deservedly had a major influence on 
early cognitive development research, but 
was not primarily devoted to thinking. In this
chapter, I consider Piaget first, followed by 
Vygotsky, and then the common ground 
between them. 


Two ideas that were central to Pi- 
aget’s conception of thought were struc- 
ture and self-regulation, both of which were 
also held by the Gestalt school. How- 
ever, a distinguishing feature of Piaget’s 
theory was that it was based on logico- 
mathematical concepts, including function, 
operation, group, and lattice. Although 
he did not claim that logic defined the 
laws of thought (cf. Boole, 1854/1951), he
used modified logics or “psycho-logics” to 
model thought. 

Piaget’s very extensive empirical investi- 
gations into the development of infants’ and 
children’s cognitions were conceptualized 
by a succession of distinct logics, which have 
come to be known as “stages” of cognitive de- 
velopment. The first was the sensorimotor 
stage, lasting from birth to about one-and- 
a-half to two years, characterized by struc- 
tured, organized activity but not thought. 
During this stage, a structure of actions be- 
came elaborated into a mathematical group, 
meaning that an integrated, self-regulating 
system of actions developed. Piaget believed 
that the concept of objects as real and 
permanent emerged as this structure was 
from approximately two to seven years, and
during this time semiotic or symbolic func- 
tions were developed, including play, draw- 
ing, imagery, and language. Thought at this 
stage was conceptualized in terms of what 
Piaget called “function logic,” the essential 
idea of which is a representation of a link 
between two variables. At the concrete op- 
erational stage, lasting from eight to about 
fourteen years, thought was conceptualized 
in terms of what Piaget called “groupings,” 
which were equivalent to the mathematical 
concept of a groupoid, meaning a set with 
a single binary operation (Sheppard, 1978). 
The essential idea here is the ability to com- 
pose classes, sets, relations, or functions, into 
integrated systems (Halford, 1982). Con- 
cepts such as conservation (invariance of 
quantity, number, weight, and volume), se- 
riation or ordering of objects, transitive in- 
ference, classification, and spatial perspec- 
tives emerge as a result of the more elaborate 
thought structures that develop during this 
time. At the formal operational stage, begin- 
ning in adolescence, the ability to compose 
concrete operations into higher-level struc- 
tures emerges with the result that thought 
has greater autonomy and flexibility. 

Cognitive development depended, accor- 
ding to Piaget, on assimilation of experi- 
ence to cognitive structures with accom- 
modation of the structure to the new 
information. The combination of assimila- 
tion and accommodation amounts to a pro- 
cess of self-regulation that Piaget termed 
“equilibration.” He rejected the association- 
ist learning theories of the time, although his 
conceptions in many ways anticipated mod- 
ern conceptions of information processing 
and dynamic systems. 

The work of the Piagetian school has been 
one of the most controversial topics in the 
field, and claims that Piaget was wrong in 
many important respects are not uncommon 
(Bjorklund, 1997; Gopnik, 1996). The fol- 
lowing points are intended to help provide 
a balanced account of this issue. First, Pi- 
aget’s empirical findings have been widely 
replicated (Modgil, 1974; Sigel & Hooper, 
1968). That is, children have been found 


he used. The major challenges to his find- 
ings have been based on different meth- 
ods of assessment, the claim being that his 
methods underestimated the cognitive ca- 
pabilities of young children (Baillargeon, 
1995; Bryant, 1972; Bryant & Trabasso, 1971; 
Donaldson, 1971; Gelman, 1972). How- 
ever, these claims also have been subject 
to controversy. Miller (1976) showed that 
nonverbal assessments did not demonstrate 
improved reasoning if the cognitive skills 
employed were taken into account, and a 
similar point was made about subsequent research
by Halford (1989). However, there
were also some hundreds of training stud- 
ies, reviewed by Field (1987) and Halford 
(1982), that were sometimes interpreted as 
showing that cognitive development could 
be accelerated and depended more on ex- 
perience than on development of thought 
structures. The stage concept has also been 
heavily criticized for theoretical inadequa- 
cies (Brainerd, 1978) and for lack of empir- 
ical support (Bruner, Olver, & Greenfield, 
1966). In particular, acquisition tends to be 
gradual and experience-based rather than 
sudden or “stage-like,” and the concurrence 
between acquisitions at the same stage of- 
ten has not been as close as Piagetian the- 
ory might be taken to imply. However, there 
have also been some spirited defenses of Pi- 
aget (Beilin, 1992; Lourenco & Machado, 
1996), and Smith (2002) has given a con- 
temporary account of Piagetian theory. See 
also the special issue of Cognitive Develop- 
ment edited by Bryant (2002) on “Construc- 
tivism Today.” 

The underlying problem here seems to 
have been that it is difficult to operational- 
ize Piagetian concepts in the methodologies 
that evolved in Anglo-Saxon psychology to 
about 1970. His conceptions have been more 
compatible with methodologies that devel- 
oped after the “cognitive revolution,” includ- 
ing information processing and dynamic sys- 
tems theories. In the next section I consider 
alternative ways of conceptualizing the de- 
velopment of children’s thought. 

The work of Vygotsky (1962) was the 
other major influence on research into the 
bution is becoming increasingly influential
even today (Lloyd & Fernyhough, 1999). 
Three of Vygotsky’s most important contri- 
butions were his ideas on the relation be- 
tween thought and language, his emphasis 
on the role of culture in the development 
of thinking, and the zone of proximal de- 
velopment. Early in the history of cognitive 
development research, there was consider- 
able debate as to whether thought depends 
on language development, as implied by 
Bruner, Olver, and Greenfield (1966), or the
reverse, as implied by Slobin (1972). Vy- 
gotsky (1962) proposed that thought and 
language have different origins both in evo- 
lution and in development. Language was 
initially social in character, whereas problem 
solving was initially motor. Language and 
thought develop independently for some 
time after infancy; then the young child de- 
velops egocentric speech, the beginning of 
the representational function. Finally, chil- 
dren develop “inner speech,” which serves 
the symbolic function of thought. Vygot- 
sky emphasized the interaction between bi- 
ological maturation and social experience. 
As the child matures, language becomes 
an increasingly important influence on the 
development of thought and is the chief 
means by which culture is absorbed by the 
child. Vygotsky’s concept of the zone of 
proximal development, which means that 
new developments are close to existing cog- 
nitive abilities, is broadly consistent with 
Piaget’s notion that new knowledge is as- 
similated to existing structure. This is part 
of a larger picture in which both Piaget and 
Vygotsky saw cognitive development as an 
active organizing process that tends toward 
an equilibrium with its own internal pro- 
cesses and with the external environment. 
Piaget’s work had greater early influence, 
but the impact of Vygotsky’s work is in- 
creasing at what appears to be an acceler- 
ating rate. Among the many areas in which 
it has been important are the development 
of education theory (Gallimore & Tharp, 
1999) and research on collaborative problem 
solving (Garton, 2004; see also Greenfield,
Chap. 27).


Theory of development of reasoning diversi- 
fied in numerous directions in the latter half 
of the twentieth century and our concep- 
tions of reasoning processes have undergone 
some fundamental changes. Perhaps one of 
the most important is that there is much less 
reliance on logic as a norm of reasoning and 
more emphasis on the interaction between 
reasoning processes and the child’s experi- 
ence. Information processing theories were 
one of the first lines of development follow- 
ing the impact of Piaget and Vygotsky, so it 
is appropriate to consider them first. 


Information Processing Theories 


An attempt to conceptualize development 
of thinking in terms of information pro- 
cessing concepts was made by what be- 
came known as the Neo-Piagetian school 
(Case, 1985, 1992a; Case et al., 1996; Chap- 
man, 1987, 1990; Fischer, 1980; Halford, 
1982, 1993; McLaughlin, 1963; Pascual- 
Leone, 1970; Pascual-Leone & Smith, 1969). 
These models, reviewed in detail by Halford 
(2002), reconceptualize Piaget’s stages in 
terms of the information processing de- 
mands they make. All of them postulate 
that higher information processing capac- 
ity becomes available with development ei- 
ther through maturation (Halford, 1993) or 
increased processing efficiency that leaves 
more capacity available for working mem- 
ory (Case, 1985). Note that these processes 
are not mutually exclusive. Chapman and 
Lindenberger (1989, p. 238) attempted to
synthesize these theories under the prin- 
ciple that “the total capacity requirement 
of a given form of reasoning is equal to 
the number of operatory variables that are 
assigned values simultaneously in employ- 
ing that form of reasoning in a particular 
task.” 

Other theoretical developments were 
more independent of the Piagetian tradi- 
tion. An important class of theories was 
based on computer simulations first using 
symbolic architectures (Halford, Wilson, & 
McDonald, 1995; Klahr & Wallace, 1976; 
ral nets (Elman, 1990; McClelland, 1995;
Shultz, 1991; Shultz et al., 1995). The
model of Klahr and Wallace (1976) was con- 
cerned with quantification operators, includ- 
ing subitizing (direct estimation of small 
sets without counting), counting, and esti- 
mation (approximate quantification of large 
sets such as crowds). It was used to model 
conservation or understanding that a quan- 
tity remains invariant despite transforma- 
tions of physical dimensions. In a typical sim- 
ple number conservation task, two rows of 
beads are placed in one-to-one correspondence.
Then one row is transformed (e.g.,
by spacing objects more widely and thus 
increasing the length of the row without 
adding any items); then the child is asked 
whether each row still contains the same 
number or whether they are different. Pre- 
conserving children cannot answer this ques- 
tion correctly because they have not learned 
that the transformation leaves number in- 
variant. In the model of Klahr and Wallace 
(1976) the task is performed initially by 
quantifying first one row followed by the 
other in the pretransformed display and then 
comparing the results. The transformed row 
is quantified again after the transformation 
and found to be still the same as the other 
row. With repeated quantification before 
and after a transformation, the rule that pre- 
and post-transformed quantities are equal is 
learned, and the quantification operators are 
no longer employed. (See also Chap. 17 by 
Lovett & Anderson, on production system 
models of thinking.) 
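
The learning sequence just described can be illustrated with a minimal sketch. It is not the production system of Klahr and Wallace (1976): the class name, the representation of a row as a list of positions, and the `threshold` of consistent observations required before the rule is adopted are all assumptions made for the example.

```python
from typing import List


class ConservationLearner:
    """Toy learner that quantifies rows until a conservation rule is acquired.

    Hypothetical sketch: `threshold` (the number of consistent observations
    needed before the rule is adopted) is an assumed parameter.
    """

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consistent_observations = 0
        self.rule_learned = False
        self.quantifications = 0  # how many times a row has been counted

    def quantify(self, row: List[float]) -> int:
        """Count the items in a row (stands in for subitizing or counting)."""
        self.quantifications += 1
        return len(row)

    def judge(self, row_a: List[float], row_b: List[float], spread: float) -> bool:
        """Return True if the rows are judged equal after row_b is spread out."""
        before_a = self.quantify(row_a)
        before_b = self.quantify(row_b)
        if self.rule_learned:
            # Rule: spacing leaves number invariant, so the transformed row
            # is not quantified again.
            return before_a == before_b
        # The transformation changes positions (row length) but adds no items.
        transformed_b = [position * spread for position in row_b]
        after_b = self.quantify(transformed_b)
        if before_b == after_b:
            self.consistent_observations += 1
            if self.consistent_observations >= self.threshold:
                self.rule_learned = True
        return before_a == after_b
```

Once `rule_learned` is set, each judgment costs two quantifications rather than three, which corresponds to the point that the quantification operators are no longer applied to the transformed row.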

The Q-SOAR model of Simon and Klahr 
(1995) applied Newell’s (1990) SOAR ar- 
chitecture to Gelman’s (1982) study of num- 
ber conservation acquisition. Children are 
shown two equal rows of objects, asked to 
count each row in turn and say how many 
each contains, then to say whether they are 
the same or different. Then one row is trans- 
formed and the preconserving child is un- 
able to say whether they are the same or 
different. This is represented in Q-SOAR 
as an impasse. The model then searches 
for a solution to the problem using the 
quantification procedure of Klahr and Wal- 


model gradually learns to classify the ac- 
tion of spacing out the items as a conserving 
transformation, using the learning mecha- 
nism of the SOAR model, called “chunking,” 
which has been shown to have considerable 
generality. 

Acquisition of transitive inference was 
simulated by the self-modifying produc- 
tion system model of Halford, Smith, et al. 
(1995). Development of transitive inference 
strategies is guided by a concept of order 
based on any representation of an ordered 
set of at least three elements. When no pro- 
duction rule exists for a given problem, the 
model uses analogical mapping and means- 
end analysis to determine the correct an- 
swer; then a production rule is created to 
handle that case. Rules are strengthened or 
weakened by subsequent experiences with 
success or failure. 
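
The cycle of rule creation and strengthening can be sketched as follows. This is an illustration under stated assumptions rather than the model of Halford, Smith, et al. (1995): `solve_by_ordering` is a hypothetical stand-in for the analogical mapping and means-end analysis step, rules are keyed on the concrete problem, and the strength increments of 0.5 are arbitrary.

```python
from typing import Dict, List, Tuple

Premise = Tuple[str, str]  # (a, b) means "a is greater than b"
RuleKey = Tuple[Tuple[Premise, ...], str, str]


def solve_by_ordering(premises: List[Premise], x: str, y: str) -> str:
    """Fallback: assemble an ordered set from the premises and read off the answer."""
    order: List[str] = []
    remaining = list(premises)
    while remaining:
        progressed = False
        for a, b in list(remaining):
            if not order:
                order = [a, b]
            elif a == order[-1]:
                order.append(b)      # extend the ordering at the low end
            elif b == order[0]:
                order.insert(0, a)   # extend the ordering at the high end
            else:
                continue
            remaining.remove((a, b))
            progressed = True
        if not progressed:
            raise ValueError("premises do not form a single ordering")
    return x if order.index(x) < order.index(y) else y


class ProductionSystem:
    """Creates a rule when none matches and adjusts its strength with feedback."""

    def __init__(self) -> None:
        # Maps a problem to (answer, strength).
        self.rules: Dict[RuleKey, Tuple[str, float]] = {}

    def answer(self, premises: List[Premise], x: str, y: str) -> str:
        key = (tuple(sorted(premises)), x, y)
        if key not in self.rules:
            # No production fires: solve the problem, then store a new rule.
            self.rules[key] = (solve_by_ordering(premises, x, y), 1.0)
        return self.rules[key][0]

    def feedback(self, premises: List[Premise], x: str, y: str, correct: bool) -> None:
        key = (tuple(sorted(premises)), x, y)
        action, strength = self.rules[key]
        self.rules[key] = (action, strength + (0.5 if correct else -0.5))
```

For the premises "Tom is taller than Bob" and "Bob is taller than Sam", the first call to `answer` builds the ordering Tom, Bob, Sam and stores a rule; later calls retrieve the stored rule directly.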


Neural Net Models 


Neural net models of thinking are reviewed 
by Doumas and Hummel (Chap. 4), but 
the contribution of neural net models to 
cognitive development is considered here. 
A good way to illustrate neural net mod- 
els of cognitive development is to examine 
McClelland’s (1995) model of children’s un- 
derstanding of the balance scale. The net is 
shown schematically in Figure 22.1 together 
with a balance scale problem. It is a three- 
layered net, which means that activation is 
propagated from the input units to the hid- 
den (middle) layer and then to the output 
layer. There are four sets of five input units 
representing one-to-five weights on pegs one 
to five steps from the fulcrum on both left 
and right sides. The units that are activated 
are shown as black. The activations in the 
input units represent the problem in the 
top of the figure. In the first set of input 
units, representing number of weights on 
the left, unit 3 is activated, coding the three 
weights on the left. Similarly, in the second 
set of input units, representing weights on 
the right, unit 4 is activated, coding four
weights on the right. Distances are coded 
in a similar way by the two sets of input
units on the right. In the first set, unit 3 is
activated, coding the weights on peg 3 on
the left, whereas in the second set, unit 2
is activated, coding weights on peg 2 on the
right.

Figure 22.1. Balance scale model of McClelland (1995). By permission of the author and Oxford University Press.
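
The input coding described above can be made concrete with a short sketch. The function names and the order in which the four banks are concatenated are assumptions made for illustration; only the scheme of four banks of five units, with one unit active per bank, comes from the description.

```python
from typing import List


def one_hot(value: int, size: int = 5) -> List[int]:
    """Activate exactly one of `size` units for a value in 1..size."""
    if not 1 <= value <= size:
        raise ValueError(f"value must be in 1..{size}")
    return [1 if unit == value else 0 for unit in range(1, size + 1)]


def encode_problem(w_left: int, w_right: int, d_left: int, d_right: int) -> List[int]:
    """Concatenate four banks of five units.

    Assumed bank order: weight left, weight right, distance left, distance right.
    """
    return one_hot(w_left) + one_hot(w_right) + one_hot(d_left) + one_hot(d_right)


# The problem discussed in the text: three weights on peg 3 on the left,
# four weights on peg 2 on the right.
example = encode_problem(w_left=3, w_right=4, d_left=3, d_right=2)
```

Because each value activates a separate unit, nothing in this coding tells the net that four weights are more than three; any such ordering has to be acquired through the connection weights.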

There are four hidden units (shown in the 
middle of the net), two of which compare 
weights and two that compare distances. 
The units that are more highly activated are 
shown as black, although activations would 
be graded, rather than all-or-none. Finally, 
there are the output units that compute the 
balance state. Activation of an output unit 
represents the corresponding side of the bal- 
ance beam going down. If the beam is bal- 
anced, the activations in the output units 
would be equal, which is defined as being 
within 0.3 of each other. 

The operation of the net can be understood
from the connection weights between
units, which are shown schematically in Figure
22.1 as +/-. The second hidden unit has
positive connections to all input units rep- 
resenting weight on the right and negative 
connections to all input units representing 
weight on the left (although only a single 
arrow is shown in each case for simplicity). 
This unit is more strongly activated because
weight on the right is greater than on the
left. The first hidden unit has the opposite 
pattern of weights and will be more strongly 
activated if weight is greater on the left. The 
second hidden unit also has positive connec- 
tions to the right output unit. Thus, greater 
weight on the right will tend to produce 
greater activation on the right output unit, 
representing a tendency for the right side to go
down. The second pair of hidden units
compare distances in corresponding fashion. 
The activations of the output units depend 
on activations of hidden units comparing 
both weights and distances. In this case the 
greater weight on the right tends to make 
the right side go down, but this is countered 
by the greater distance on the left; thus, the 
predicted position of the beam will be ap- 
proximately balanced, although, in fact, the 
left side would go down. The network does 
not compute the product of weight and dis- 
tance but compares the influences of weights 
and distances on each side. 
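
A forward pass through a net of this shape can be sketched as follows. The connection weights here are set by hand rather than learned, so this is an illustration of the comparison scheme and not McClelland's (1995) trained network; the `gain` parameter, the graded `magnitude` values, and the unit connection weights from hidden to output units are all assumptions.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def one_hot(value: int, size: int = 5) -> List[int]:
    return [1 if unit == value else 0 for unit in range(1, size + 1)]


def forward(w_left: int, w_right: int, d_left: int, d_right: int,
            gain: float = 1.0) -> Tuple[float, float]:
    """Return (left_output, right_output) for one balance scale problem."""
    # Assumed graded connection weights: unit k carries magnitude gain * k.
    magnitude = [gain * k for k in range(1, 6)]

    def net(bank_pos: List[int], bank_neg: List[int]) -> float:
        # Positive connections from one bank, negative from the opposite bank.
        return sum(m * (p - n) for m, p, n in zip(magnitude, bank_pos, bank_neg))

    wl, wr = one_hot(w_left), one_hot(w_right)
    dl, dr = one_hot(d_left), one_hot(d_right)
    h_weight_left = sigmoid(net(wl, wr))
    h_weight_right = sigmoid(net(wr, wl))
    h_dist_left = sigmoid(net(dl, dr))
    h_dist_right = sigmoid(net(dr, dl))
    # Each output unit is excited by the hidden units favoring its own side
    # and inhibited by the hidden units favoring the other side.
    out_left = sigmoid(h_weight_left + h_dist_left - h_weight_right - h_dist_right)
    out_right = sigmoid(h_weight_right + h_dist_right - h_weight_left - h_dist_left)
    return out_left, out_right


def balance_state(out_left: float, out_right: float, tolerance: float = 0.3) -> str:
    """Outputs within `tolerance` of each other are read as a balanced beam."""
    if abs(out_left - out_right) <= tolerance:
        return "balance"
    return "left down" if out_left > out_right else "right down"
```

With these hand-set weights, `forward(3, 4, 3, 2)` gives equal outputs: the weight advantage on the right and the distance advantage on the left cancel, so the net reports balance even though the torques are 9 on the left and 8 on the right. This mirrors the point that the net compares influences rather than computing products.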

The network was trained by backpropa- 
gation; that is, comparing the network’s out- 
put on each trial with the correct output and 
then adjusting the connection weights to re- 
duce the discrepancy. The training would 
weights or larger distances having greater
connection weights to the hidden units. 
Thus, metrics for weight and distance 
emerge as a result of training and are not 
predefined in the net. This is possibly the 
most important property of the model be- 
cause it shows how a structured representa- 
tion can emerge from the process of learn- 
ing to compute input-output functions that 
match those in the environment. 
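
The weight adjustment performed on each trial can be sketched as a single backpropagation step for a three-layered net. The sketch is generic: the absence of bias terms, the squared-error measure, and the learning rate `lr` are assumptions, and nothing here reproduces the training regime actually used for the balance scale model.

```python
import math
from typing import List

Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def layer(weights: Matrix, inputs: List[float]) -> List[float]:
    """Activations of one layer: sigmoid of the weighted sum for each unit."""
    return [sigmoid(sum(w * a for w, a in zip(row, inputs))) for row in weights]


def loss(w1: Matrix, w2: Matrix, x: List[float], target: List[float]) -> float:
    """Half the summed squared discrepancy between output and target."""
    output = layer(w2, layer(w1, x))
    return 0.5 * sum((y - t) ** 2 for y, t in zip(output, target))


def backprop_step(w1: Matrix, w2: Matrix, x: List[float],
                  target: List[float], lr: float = 0.5) -> None:
    """Adjust w1 and w2 in place to reduce the output error on one trial."""
    hidden = layer(w1, x)
    output = layer(w2, hidden)
    # Error signal at each output unit.
    delta_out = [(y - t) * y * (1.0 - y) for y, t in zip(output, target)]
    # Error signal propagated back to each hidden unit (uses w2 before update).
    delta_hid = [
        h * (1.0 - h) * sum(d * w2[o][j] for o, d in enumerate(delta_out))
        for j, h in enumerate(hidden)
    ]
    for o, d in enumerate(delta_out):
        for j, h in enumerate(hidden):
            w2[o][j] -= lr * d * h
    for j, d in enumerate(delta_hid):
        for i, a in enumerate(x):
            w1[j][i] -= lr * d * a
```

Repeating `backprop_step` over many problems is what allows graded connection weights, and hence a metric, to emerge from units that start out unordered.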

The model also captures a number of 
crucial developmental results. Its progress 
through training corresponded with the 
course of development as defined by 
Siegler’s (1981) rules. According to Rule I, 
judgments are based on weight, irrespective 
of distance. Rule II states that distance is
considered if the weights are first found 
to be equal. Rule III asserts that weight 
and distance are considered but difficulty 
is encountered when weight is greater on 
one side and distance is greater on the 
other. Rule IV (torque rule) involves com- 
paring the product of weight and distance 
on the left side with the product of weight 
and distance on the right side. The model 
also captured the torque difference effect — 
that is, the difference between the product 
of weight and distance on the left ($W_l \times D_l$)
and on the right ($W_r \times D_r$) affects children’s
performance, because they are more likely 
to recognize that one side will go down if 
torque difference is large even though there 
is no logical basis for this given that even a 
small torque difference will cause one side 
to go down. This is one of many ways in 
which neural net models capture psycholog- 
ical properties of task performance. 
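
The four rules can be written out as decision procedures, which makes it easy to see where they diverge. This is a sketch based only on the verbal descriptions above; the function names and the label "uncertain" for the conflict case under Rule III are assumptions.

```python
def compare(left: int, right: int) -> str:
    """Return which side has the larger value, or 'balance' if they are equal."""
    if left > right:
        return "left"
    if right > left:
        return "right"
    return "balance"


def rule_i(wl: int, wr: int, dl: int, dr: int) -> str:
    """Rule I: judge by weight alone, irrespective of distance."""
    return compare(wl, wr)


def rule_ii(wl: int, wr: int, dl: int, dr: int) -> str:
    """Rule II: consider distance only when the weights are equal."""
    return compare(wl, wr) if wl != wr else compare(dl, dr)


def rule_iii(wl: int, wr: int, dl: int, dr: int) -> str:
    """Rule III: consider both, but fail when weight and distance conflict."""
    by_weight = compare(wl, wr)
    by_distance = compare(dl, dr)
    if by_weight == by_distance:
        return by_weight
    if by_weight == "balance":
        return by_distance
    if by_distance == "balance":
        return by_weight
    return "uncertain"  # weight favors one side, distance the other


def rule_iv(wl: int, wr: int, dl: int, dr: int) -> str:
    """Rule IV (torque rule): compare weight x distance on each side."""
    return compare(wl * dl, wr * dr)
```

For the problem used earlier (three weights on peg 3 on the left, four weights on peg 2 on the right), Rules I and II predict that the right side goes down, Rule III yields a conflict, and only Rule IV gives the correct answer that the left side goes down, because 3 x 3 = 9 exceeds 4 x 2 = 8.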

This model computes the balance state 
as a function of weight and distance on left 
and right sides of the balance beam. How- 
ever, understanding the balance beam also 
entails determining weight or distance val- 
ues that will make the beam balance. There 
are effectively five variables here: $W_l$, $D_l$,
$W_r$, $D_r$, and balance state. Complete understanding
of the balance scale would include
being able to determine any variable given 
the other four; that is, compute all five func- 
tions implicated by the balance scale con- 


tions are that, as Marcus (1998a, 1998b) has
pointed out, if the model is trained on two or 
three weights on either side, it cannot gener- 
alize to problems with four or five weights. 
Again, however, it would be reasonable to 
expect that children would generalize in this 
way. The conclusions therefore are that the 
model can be trained to compute one func- 
tion implicated by the balance scale, albeit 
under restricted conditions, and that it does 
not fully capture understanding of the con- 
cept but is nevertheless an important step 
forward in our understanding of cognitive 
development because it shows how struc- 
tured representation can emerge. 

The balance scale model by McClelland 
(1995) is a three-layered, or backpropaga- 
tion, net. This type of architecture has been 
used in a great many models, in cognitive 
development and elsewhere. One reason is 
that it can, in principle, compute any input-output
function. The simple recurrent net
(Elman, 1990) is an important model in this 
class. In this type of net, activations in the 
hidden units are copied over into context 
units. On the next trial, activations in the 
hidden units are influenced by activations in 
both the input units and the context units. 
The result is that the output of the net 
is influenced by representations on previ- 
ous trials as well as by the current input. 
The net therefore takes account of links be- 
tween events in a sequence. The model was 
trained to predict the next word in a sen- 
tence. Training was based on a large cor- 
pus of sentences by representing each suc- 
cessive word in the input units, and the 
output units were trained to represent the 
next word. Feedback was given concerning 
the accuracy of the output, thereby adjust- 
ing the connection weights to improve the 
model’s prediction. The model learned to 
predict the next word in a sentence and re- 
spected grammatical categories even when 
words in related categories spanned embed- 
ded clauses. Cluster analysis of the hidden 
unit activations showed that words in the 
same grammatical category, such as nouns 
or verbs, tended to have similar activations. 
Semantically similar words, such as those 
to have similar hidden unit representations.
Elman (1990) was careful not to predefine 
categories, and the inputs used were orthog- 
onal; thus, no pre-existing similarities were 
supplied to the model. Similarities were 
created in the hidden unit activations that 
reflected the input-output functions the 
model was required to learn. Therefore, 
to the extent that categories developed, 
they are an emergent property of the 
model and one that reflects contingencies in 
the environment. This model, like that of 
McClelland (1995), offers a possible mech- 
anism by which structured representation 
might be acquired. 
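
The copy-back mechanism can be sketched in a few lines. The code below is a minimal forward pass only, with random untrained weights; the layer sizes, the tanh activation, and all function names are assumptions made for illustration and are not taken from Elman (1990).

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def srn_step(x, context, w_in, w_ctx, w_out):
    """One forward step: hidden units see the input and the context units."""
    net = [a + b for a, b in zip(matvec(w_in, x), matvec(w_ctx, context))]
    hidden = [math.tanh(n) for n in net]
    output = matvec(w_out, hidden)
    # The hidden activations are copied to become the next context.
    return output, list(hidden)


def run_sequence(seq, n_hidden, w_in, w_ctx, w_out):
    """Feed a sequence of input vectors, carrying the context forward."""
    context = [0.0] * n_hidden
    outputs = []
    for x in seq:
        out, context = srn_step(x, context, w_in, w_ctx, w_out)
        outputs.append(out)
    return outputs


rng = random.Random(0)
n_in, n_hidden = 3, 4
w_in = make_matrix(n_hidden, n_in, rng)
w_ctx = make_matrix(n_hidden, n_hidden, rng)
w_out = make_matrix(n_in, n_hidden, rng)

# Orthogonal (one-hot) codes for three hypothetical words.
a, b, c = [1, 0, 0], [0, 1, 0], [0, 0, 1]
out_after_a = run_sequence([a, c], n_hidden, w_in, w_ctx, w_out)[-1]
out_after_b = run_sequence([b, c], n_hidden, w_in, w_ctx, w_out)[-1]
```

The final input is the same word in both sequences, yet the two outputs differ because the context units carry a trace of the preceding word. Training the weights by backpropagation, as in the model described above, is omitted from this sketch.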

The ability of simple recurrent nets to 
predict sequences has been utilized to model 
infants’ expectations of the reappearance of 
occluded objects (Mareschal, Plunkett, & 
Harris, 1995; Munakata et al., 1997) thereby 
simulating infants’ understanding of the ob- 
ject concept (Baillargeon et al., 1990). These 
models are basically consistent with the 
model of Smith et al. (1999). Again, how- 
ever, there have been limitations. Marcus 
(1998a, 1998b) found that the model of Mu- 
nakata et al. (1997) did not generalize to ob- 
jects in new positions on the display. 

The potential of models such as this to 
learn regularities in the environment and ac- 
quire concepts has inspired a whole new 
approach to cognitive development (Elman 
et al., 1996). Elman et al. (1996) see con- 
nectionism as giving more powerful means 
to analyze the gene-environment interac- 
tions that are the basis of development. They 
advocate a form of connectionism that is 
founded in biology, is influenced by devel- 
opmental neuroscience, and that can pro- 
duce neurologically plausible computational 
models. Although they see an undoubted 
role for innateness in cognitive develop- 
ment, they argue some nativist conceptions 
underestimate the potential for new cogni- 
tive forms to emerge from the interaction 
of neural processes. The simple recurrent 
net nicely illustrates how representations 
that respect distinctions between word cat- 
egories emerge from the model’s interaction 
with the environment. 


Cascade correlation models provide a mech- 
anism by which the dimensionality of rep- 
resentations can be increased to handle in- 
creased dimensions in the task. They do this 
by adding units to the hidden layer. The ini- 
tial net has minimal hidden units and some- 
times starts with none. Training takes place 
in two modes. In the first mode, weights are 
adjusted to yield the appropriate output for 
each input. In the second mode, hidden units 
are recruited to increase the accuracy of the 
output. Recruitment is based on correlation 
between a candidate’s activation and the ex- 
isting error of the network. After recruit- 
ment of a hidden unit, training continues 
in the first mode and the system cycles be- 
tween the modes until a learning criterion 
is reached. 
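
The recruitment step in the second mode can be sketched as follows. This is a simplified illustration: the use of the absolute Pearson correlation as the score, the candidate names, and the activation values are all assumptions, and the scoring used in published cascade correlation implementations may differ in detail.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def recruit(candidates, residual_error):
    """Pick the candidate whose activations track the residual error best.

    candidates: dict mapping a (hypothetical) unit name to its activation on
    each training pattern; residual_error: the network's error per pattern.
    """
    scores = {name: abs(pearson(acts, residual_error))
              for name, acts in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Made-up activations for three candidate units over four patterns.
error = [0.9, 0.1, 0.8, 0.2]
candidates = {
    "c1": [0.5, 0.5, 0.4, 0.6],
    "c2": [0.1, 0.9, 0.2, 0.8],
    "c3": [0.5, 0.5, 0.5, 0.5],
}
best, scores = recruit(candidates, error)
```

The sign of the correlation is ignored because a recruited unit's outgoing weight can be positive or negative; what matters in this sketch is how strongly the unit's activation covaries with the remaining error.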

Cascade correlation models have been 
used to model a number of developmen- 
tal phenomena (Shultz, 1991; Shultz et al., 
1995; Sirois & Shultz, 1998). Shultz and col- 
leagues used cascade correlation to model 
the same balance scale problem modeled by 
McClelland (1995). The initial net was sim- 
ilar to that used by McClelland and shown 
in Figure 22.1, but without hidden units. Ini- 
tial training was with problems varying only 
in weight, and the net performed consistently
with Siegler’s (1981) Rule I. Once the distance
variable was introduced, the net recruited
a single hidden unit. It then progressed
to Rule II and higher rules that take
account of distance, effectively simulating 
the developmental progression in a manner 
similar to McClelland’s model (1995). 


Neural Net Models and Symbolic 
Processes 


Concern that three-layered net models do 
not capture symbolic processes has been 
expressed by Fodor and Pylyshyn (1988).
Properties that are considered essential by 
Fodor and Pylyshyn (1988) are composition- 
ality and systematicity. The essential idea of 
compositionality is that symbols must re- 
tain their identity and their meaning when 
combined into more complex representa- 
tions. Thus, the cognitive symbols for “dog” 
combined into the symbol for “happy dog.”
Prototypes are not necessarily compositional 
in this way (Fodor, 1995). One problem for 
three-layered net models is that the repre- 
sentations in the hidden units do not neces- 
sarily include the components of the input in 
a form that is recognizable to the performer. 
Any structure that exists in the hidden layer 
must be discovered by an external observer 
(the experimenter) using techniques such as 
cluster analysis (Elman, 1990). Representa- 
tions in hidden units are not accessible to 
strategic processes. They are more like im- 
plicit knowledge (Karmiloff-Smith, 1994). 
Systematicity, in essence, means that cog- 
nitive processes are subject to structural 
constraints independent of content. Three- 
layered nets lack strong systematicity (Mar- 
cus, 1998a, 1998b; Phillips, 1994), meaning 
they cannot generalize to an element that has 
not occurred in the same role before, even if 
the element is familiar. Thus, a net trained 
on “John loves Mary” and “Tom loves Jane” 
could generalize to “John loves Jane” but not 
to “Jane loves John” or even to “Mary loves 
John.” Nets of this type learn representations
that are needed to compute the input-output
functions on which they are trained,
but they do not learn abstract relations. 
Although three-layered net models have 
real potential to advance research on cog- 
nitive development (Bray et al., 1997), it 
appears they lack the structural properties 
that have long been regarded as character- 
istic of higher cognition (Chomsky, 1980; 
Humphrey, 1951; Mandler & Mandler, 1964; 
Miller, Galanter, & Pribram, 1960; Newell, 
1990; Piaget, 1950; Wertheimer, 1945). One 
response to this problem (Smolensky, 1988) 
is that neural net models seek to explain 
symbols as emergent properties of more ba- 
sic processes, as was illustrated earlier. A sec- 
ond approach has been to develop symbolic 
neural net models of higher cognitive pro- 
cesses (Doumas & Hummel, Chap. 4; Shas- 
tri & Ajjanagadde, 1993; Smolensky, 1990; 
but see also Halford, Wilson, & Phillips, 
1998). A symbolic connectionist account 
of cognitive development has been given 
by Halford and his collaborators (Halford, 


See Halford (2002) for a summary of this 
approach. 


Dynamic Systems Models 


Dynamic systems models (Fischer & Bidell, 
1998; Fischer & Pare-Blagoev, 2000; van 
Geert, 1991, 1998, 2000) have offered new 
ways to analyze developmental data. A dy- 
namic system is a formal system, the state 
of which depends on its state at a previous 
point in time. The dynamic system model 
of van Geert (1998) was designed around 
principles derived from the work of Piaget 
and Vygotsky and has a number of interest- 
ing properties. It can account for different 
types of cognitive growth, such as slow linear 
increase and sudden discontinuities, within 
the same system. It can also show how a 
complex, self-regulating system can emerge 
from the interaction of a few variables. The 
model was fitted to a number of develop- 
mental data sets, and some important devel- 
opmental phenomena, including conserva- 
tion acquisition, were simulated. Links have 
also been made between dynamic systems 
models and neural net models. 
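
The defining property, that each state is computed from the previous one, can be illustrated with a generic growth rule. The logistic equation below is chosen here purely for illustration and is an assumption; the text does not give the equations of van Geert's (1998) model, and the parameter values are arbitrary.

```python
def grow(level, rate, capacity):
    """Next state as a function of the current state (logistic growth)."""
    return level + rate * level * (1 - level / capacity)


def simulate(initial, rate, capacity, steps):
    """Iterate the growth rule, returning the whole trajectory."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(grow(trajectory[-1], rate, capacity))
    return trajectory


# Same rule, two growth rates: gradual change versus an abrupt jump.
slow = simulate(initial=0.01, rate=0.1, capacity=1.0, steps=50)
fast = simulate(initial=0.01, rate=1.5, capacity=1.0, steps=50)
```

With the low rate the level creeps upward over all 50 steps, whereas with the high rate it jumps to the neighbourhood of the capacity within a handful of steps. A single rule therefore yields both gradual and sudden change, depending only on a parameter.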

Dynamic systems models have also 
been linked to other issues. Raijmakers, 
van Koten, and Molenaar (1996) analyzed 
McClelland’s (1995) neural net model of the 
balance scale and found no evidence of the 
flags indicating discontinuities that are found 
in empirical data. They suggest that back- 
propagation models simulate the type of 
stimulus-response associations that are char- 
acteristic of animals and young children but 
do not simulate the rule-governed behavior 
characteristic of older children and adults. In 
many respects, this finding is consistent with 
the analysis of the model presented earlier. 
On the other hand, backpropagation mod- 
els incorporate learning functions that have 
been missing from models of higher cogni- 
tive processes. As we have seen, they show 
how structured representations begin to 
emerge as a result of learning input-output 
functions. 

Although there are acknowledged difficulties
with dynamic systems models (van
phisticated implementations of important
developmental theories, including those of
Piaget and Vygotsky. This does not mean 
that Piaget and Vygotsky are fully vindicated 
by dynamic systems models, but concepts 
such as equilibration and self-regulation, 
which are at the core of their theories, do 
seem to have a new lease on life. Most importantly,
dynamic systems models have potential
to deepen our understanding of cognitive
developmental processes. And, as Fischer 
and Pare-Blagoev (2000) point out, there are 
tools based on Lotus 1-2-3 or Microsoft Excel
that make dynamic system modeling more 
accessible. 


Links to Brain Development 


The finding by Thatcher, Walker, and Giu- 
dice (1987) of brain growth spurts that ap- 
peared to correspond to stage transitions in 
cognitive development stimulated consider- 
able interest in the explanatory potential of 
neural maturation. One of the important 
landmarks in infant development is the A-not-B
error: If infants are shown a toy hidden
at A several times and allowed to retrieve
it and then see it hidden at B, before
approximately 12 months of age they
tend to search for it at A. Studies by Diamond
(1988) and Goldman-Rakic (1987)
showing the link between frontal lobe function
and the A-not-B error were important
stimuli to work on infant brain development. 
Case (1992a, 1992b) and Fischer (1987; Fis- 
cher & Rose, 1996) have drawn interest- 
ing parallels between cognitive development 
and the growth of connections between the 
frontal lobes and other brain regions. Robin 
and Holyoak (1995) and Waltz et al. (1999) 
have also drawn attention to the role of the 
frontal cortex in processing relations of the 
kind described by Halford and his collabo- 
rators (Halford, 1993; Halford, Bain et al., 
1998; Halford, Wilson, & Phillips, 1998). In 
a different context, Rudy, Keith, and Geor- 
gen (1993) present evidence that configu- 
ral learning (e.g., conditional discrimination, 
in which a cue-response link is reversed on 


tion of the hippocampus. 

At a more general level, Quartz and Se- 
jnowski (1997) have argued that synaptic 
growth, axonal arborization, and dendritic 
development play a role in processing ca- 
pacity increase with age. They also point out 
that neural plasticity would cause capacity to 
increase as a function of experience. This im- 
plies that the issue of whether cognitive de- 
velopment depends on capacity, knowledge, 
or both may need to be redefined. It might 
be that cognitive development depends on 
growth of capacity, which is at least partly 
produced by experience. 


Strategy Development 


Problem-solving strategies are important to 
reasoning in children and adults, and much 
of the improvement in children’s reasoning 
can be attributed to development of more 
powerful strategies. It is appropriate there- 
fore that much research has been devoted to 
development of strategies. Following work 
on rule assessment (Briars & Siegler, 1984; 
Siegler, 1981), Siegler and his collaborators 
conducted an extensive study of strategy 
(Siegler, 1999; Siegler & Chen, 1998; Siegler 
& Jenkins, 1989; Siegler & Shipley, 1995; 
Siegler & Shrager, 1984). Two of the models
were concerned with development of addition
strategies in young children. When
asked to add two single-digit numbers, they 
chose between a set of strategies including 
retrieving the answer from memory, decomposing
the numbers (e.g., 3 + 5 = 4 + 4 = 8),
counting both sets (counting right through
a set of three and a set of five, perhaps using
fingers), and the min strategy of counting on
from the top number in the larger set (e.g.,
5, 6, 7, 8, so 3 + 5 = 8).

Siegler and Shrager’s early strategy choice 
model (1984) was based on distribution of 
associations. The idea is that each addition 
sum is associated with answers of varying 
strengths, and so for a given sample of children,
2 + 1 might yield the answer “3” 80%
of the time; “1” or “2,” 4%; “4,” 3%; and so 
on. The chance of an answer being chosen is 
a function of its associative strength relative 
distribution, the more likely it will be that
a single answer will occur. However, it will 
be adopted only if it is above the confidence 
criterion. If not, alternative strategies, such 
as counting, are sought. 
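
The retrieval-then-backup logic can be sketched as below. The sketch is a simplification under stated assumptions: the strengths only loosely follow the 2 + 1 example, the confidence criterion of 0.5 is arbitrary, and the function names are hypothetical rather than taken from Siegler and Shrager (1984).

```python
import random


def count_all(a, b):
    """Backup strategy: count out both sets (assumed always correct here)."""
    return a + b


def answer(a, b, associations, criterion, rng):
    """Try retrieval first; fall back to counting if confidence is too low.

    associations: dict mapping candidate answers to associative strengths;
    criterion: the confidence threshold a retrieved answer must reach.
    """
    candidates = list(associations)
    weights = [associations[c] for c in candidates]
    # Probability of retrieving an answer is proportional to its strength.
    retrieved = rng.choices(candidates, weights=weights, k=1)[0]
    if associations[retrieved] >= criterion:
        return retrieved, "retrieval"
    return count_all(a, b), "counting"


# Strengths loosely following the 2 + 1 example in the text.
two_plus_one = {3: 0.80, 1: 0.04, 2: 0.04, 4: 0.03}
rng = random.Random(1)
results = [answer(2, 1, two_plus_one, criterion=0.5, rng=rng)
           for _ in range(200)]
```

Because only the strongly associated answer clears the criterion, weakly associated errors are never stated; they trigger the backup strategy instead. Raising the criterion above 0.80 would force counting on every trial.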

In their later work, Siegler and his col- 
laborators developed the Adaptive Strategy 
Choice Model (ASCM, pronounced “Ask- 
em”) which makes more active strategy 
choices. At the beginning, ASCM knows 
only the small set of strategies typically used 
by 4-year-olds, but it has general cognitive 
skills for choosing and evaluating strategies. 
The model is trained on a set of elementary 
addition facts; then the min strategy is added 
to the model’s repertoire. This entails count- 
ing on from the larger number to be added, 
so if the sum is 5 + 3, the procedure is to
count 5, 6, 7, 8. The model chooses a strat- 
egy for each problem on the basis of the past 
speed and accuracy of the strategy and on 
similarity between the current problem and 
past problems in which a strategy has been 
used. Each time a strategy is used, the record 
of its success is updated, and the projected 
strength of the strategy for that problem is 
calculated. The strength of association be- 
tween a problem and a specific answer is in- 
creased or decreased as determined by the 
success of the answer. One of the strengths 
of the model is that it can account for vari- 
ability both between children and between 
different strategies used by the same child 
for a particular class of problems. Most im- 
portantly, it provides a reasonably accurate 
account of strategy development in children 
as they age. 
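
A highly simplified sketch of two ingredients of this account follows: the min strategy itself, and a running record of each strategy's accuracy and speed that drives choice. The scoring rule (accuracy divided by mean time), the class and function names, and the timing values are assumptions made for illustration; the sketch omits problem similarity and is not the published ASCM.

```python
def min_strategy(a, b):
    """Count on from the larger addend, e.g. 5 + 3 -> 5, 6, 7, 8."""
    start, steps = max(a, b), min(a, b)
    counts = [start + i for i in range(steps + 1)]
    return counts[-1], counts


class StrategyRecord:
    """Running record of one strategy's past accuracy and speed."""

    def __init__(self):
        self.uses = 0
        self.correct = 0
        self.total_time = 0.0

    def update(self, was_correct, time_taken):
        """Record one use; time_taken is assumed to be positive."""
        self.uses += 1
        self.correct += int(was_correct)
        self.total_time += time_taken

    def strength(self):
        # Assumed scoring rule: accuracy divided by mean time per use.
        if self.uses == 0:
            return 0.0
        accuracy = self.correct / self.uses
        mean_time = self.total_time / self.uses
        return accuracy / mean_time


def choose(records):
    """Pick the strategy with the highest projected strength."""
    return max(records, key=lambda name: records[name].strength())


total, counts = min_strategy(5, 3)

records = {"count_all": StrategyRecord(), "min": StrategyRecord()}
records["count_all"].update(was_correct=True, time_taken=8.0)
records["min"].update(was_correct=True, time_taken=4.0)
preferred = choose(records)
```

Both strategies are equally accurate in this toy record, so the faster one is preferred. Feeding in different histories for different simulated children would give different preferences, which is one way to picture the variability the model is said to capture.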


Complexity 


Children become capable of more complex 
reasoning with age, and it is therefore im- 
portant to have some way of comparing the 
complexities of reasoning tasks. A concep- 
tual complexity theory and accompanying 
metric that discriminate tasks of different 
difficulty and explain why they differ is es- 
sential to understanding cognitive develop- 
ment. It is also necessary to define equiv- 
alence in cognitive tasks. In the past, tasks 


if they require the same knowledge domain 
and are similar methodologically, or if they 
have similar difficulties on a psychometric 
scale. Although these criteria have great util- 
ity, they have not led to an understanding of 
factors that underlie complexity, nor do they 
explain why tasks that differ in content or 
procedure can be of equivalent complexity 
whereas tasks that are superficially similar 
can be very different in complexity. With- 
out a means of assigning cognitive tasks to 
equivalence classes with common properties 
and relating tasks in different classes to each 
other in an orderly way, psychology is in a 
position similar to that of chemistry with- 
out the periodic table (Frye & Zelazo, 1998). 
Two metrics for cognitive complexity have 
been developed in the past decade. 


COGNITIVE COMPLEXITY AND CONTROL 
(CCC) THEORY 

Frye, Zelazo, & Palfai (1995; Zelazo & Frye, 
1998) analyze complexity according to the 
number of hierarchical levels of rules re- 
quired for the task. A simple task entails 
rules that link an antecedent to a consequent,
a → c, whereas complex tasks have
rules that are embedded in a higher-order 
rule that modifies the lower level rules; thus, 
another level is added to the hierarchy. The 
dimensional change card sort task has been 
a fruitful implementation of this theory. In 
a simple sorting task, a green circle might 
be assigned to the green category and a red 
triangle to the red category, where cate- 
gories are indicated by templates comprising 
a green triangle and a red circle. In a com- 
plex task, sorting depends on whether the 
higher order rule specifies sorting by color, 
as just mentioned, or by shape. If sorting is 
by shape, the green circle is sorted with the 
red circle, and the red triangle is sorted with 
the green triangle. Normative data (e.g., Ze- 
lazo & Frye, 1998; Zelazo & Jacques, 1996) 
indicate that children typically process a sin- 
gle rule by two years of age, a pair of rules 
by three years, and a pair of rules embed- 
ded under a higher order rule by four years. 
The dimensional change card sort task has 
been a useful predictor of other cognitive 
Zelazo, & Palfai, 1995).
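
The rule hierarchy in the card sort task can be written out directly. The sketch below uses the templates and cards from the example above; the data layout and the function name are assumptions introduced here.

```python
TEMPLATES = [
    {"color": "green", "shape": "triangle"},
    {"color": "red", "shape": "circle"},
]


def sort_card(card, dimension):
    """Return the template a card goes with under the current game.

    The higher-order rule (dimension) selects which pair of lower-level
    rules applies: match on colour, or match on shape.
    """
    if dimension not in ("color", "shape"):
        raise ValueError("dimension must be 'color' or 'shape'")
    for template in TEMPLATES:
        if template[dimension] == card[dimension]:
            return template
    raise ValueError("no template matches this card")


green_circle = {"color": "green", "shape": "circle"}
red_triangle = {"color": "red", "shape": "triangle"}

by_color = sort_card(green_circle, "color")   # goes with the green triangle
by_shape = sort_card(green_circle, "shape")   # goes with the red circle
```

The same card lands on different templates depending on the value of `dimension`, which is the sense in which the higher-order rule modifies the lower-level rules.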


THE RELATIONAL COMPLEXITY (RC) METRIC 


Halford (Halford, Wilson, & Phillips, 1998) 
defines complexity as a function of the num- 
ber of variables that can be related in a single 
cognitive representation. This corresponds 
to the arity, or number of arguments (slots) 
of a relation (an n-ary relation is a set of 
points in n-dimensional space). Normative 
data indicate that quaternary relations (four 
related variables) are the most complex that 
can be processed in parallel by most humans, 
although a minority can process quinary re- 
lations under optimal conditions. Children 
can process unary relations at one year, bi- 
nary relations at two years, ternary relations 
at five years, and the adult level is reached at 
11 years (median ages). 

Complex tasks are segmented into compo- 
nents that do not overload capacity to pro- 
cess information in parallel. However, rela- 
tions between variables in different segments 
become inaccessible (just as a three-way in- 
teraction would be inaccessible if two-way 
analyses were performed). Processing loads 
can also be reduced by conceptual chunking, 
which is equivalent to compressing variables 
(analogous to collapsing factors in a multi- 
variate experimental design). For example, 
velocity = distance/time but can be recoded 
to a binding between a variable and a constant
(e.g., speed = 80 kph) (Halford, Wilson,
& Phillips, 1998, Section 3.4.1). Conceptual
chunking reduces processing load,
but chunked relations become inaccessible 
(e.g., if we think of velocity as a single vari- 
able, we cannot determine what happens to 
velocity if we travel the same distance in half 
the time). Complexity analyses are based on 
the principle that variables can be chunked 
or segmented only if relations between them do 
not need to be processed. Tasks that impose 
high loads are those in which chunking and 
segmentation are constrained. 
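
The cost of segmentation can be shown with a small worked example. Relations are represented here as sets of tuples, so that arity is simply tuple length; the exclusive-or relation is an illustration chosen for this sketch and does not come from the relational complexity literature cited above.

```python
from itertools import product


def arity(relation):
    """Number of argument slots in a relation given as a set of tuples."""
    return len(next(iter(relation)))


def project(relation, slots):
    """Segment a relation: keep only the listed argument positions."""
    return {tuple(row[i] for i in slots) for row in relation}


# A ternary relation: the third value is constrained by the first two.
xor = {(x, y, x ^ y) for x, y in product((0, 1), repeat=2)}
# An unconstrained ternary relation over the same values.
anything = set(product((0, 1), repeat=3))

# Every two-variable segment of the two relations looks identical.
pairs = [(0, 1), (0, 2), (1, 2)]
same_segments = all(project(xor, p) == project(anything, p) for p in pairs)
```

Although the two ternary relations differ, each pairwise segment contains all four value combinations in both cases. The constraint linking the three variables is visible only when all three are represented together, which parallels the three-way interaction analogy.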

CCC and RC theories have some com- 
mon ground, but whereas CCC attributes 
complexity to the number of levels of a hi- 
erarchy, RC attributes it to the number of 


fore RC theory is directly applicable both to 
hierarchical and nonhierarchical tasks. Also, 
the principles of segmentation and concep- 
tual chunking imply that difficult tasks are 
those that cannot be decomposed into sim- 
pler tasks. In the sorting task discussed with 
respect to CCC theory, it is necessary to keep 
in mind that we are sorting by color in order 
to determine that the green circle is sorted 
with the green triangle. This means the task 
cannot be decomposed into two subtasks 
that are performed independently, because 
the conflicting dimension is always present. 

Andrews and Halford (2002) showed that 
with four- to eight-year-old children, in the 
domains of transitivity, hierarchical classi- 
fication, cardinality, comprehension of rel- 
ative clause sentences, hypothesis testing, 
and class inclusion, a single relational com- 
plexity factor accounted for approximately 
50% of variance and factor scores correlated 
with fluid intelligence (r = .79) and working 
memory (r = .66).


Increased Dimensionality 


Taking account of extra dimensions is a fun- 
damental requirement for cognitive devel- 
opment. For example, the progression from 
an undifferentiated concept of heat to a con- 
cept that distinguishes heat and temperature 
entails taking account of the dimensions of 
mass and specific heat: Heat = temperature
× specific heat × mass. Similarly, the distinction
between weight and density depends
on taking into account volume and
specific gravity: Weight = specific gravity ×
volume. Taking account of the extra dimensions
enables children to progress from undifferentiated
concepts of heat or weight to
more sophisticated concepts that recognize 
the distinction between heat and tempera- 
ture or between weight and specific gravity. 
Thus, they become capable of recognizing 
that a piece of aluminium weighs less than 
a similar volume of lead, but a sufficiently 
large piece of aluminium can weigh more
than a piece of lead. Arguably, the progres- 
sion that children make here parallels the de- 
velopment of these concepts in the history of 
tirely different context, acquisition of con-
servation of continuous quantity arguably 
entails taking account of height and width 
of containers, rather than fixating on height 
alone (Piaget, 1950). The essential point 
here is that cognitive representations must 
include sufficient dimensions to take ac- 
count of the variations in a phenomenon, 
so children must represent volume and spe- 
cific gravity to take account of variations in 
weight, and so on. 
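
The aluminium and lead comparison can be worked through numerically. The densities below are approximate standard values in g/cm^3 (numerically equal to specific gravity relative to water); the volumes and the function name are assumptions chosen for illustration.

```python
# Approximate densities in g/cm^3.
DENSITY = {"aluminium": 2.70, "lead": 11.34}


def weight_g(material, volume_cm3):
    """Weight (strictly, mass in grams) = density x volume."""
    return DENSITY[material] * volume_cm3


# Equal volumes: the aluminium piece is lighter.
aluminium_10 = weight_g("aluminium", 10)
lead_10 = weight_g("lead", 10)

# A sufficiently large piece of aluminium outweighs a small piece of lead.
aluminium_100 = weight_g("aluminium", 100)
```

A representation with only one dimension, such as size alone or material alone, cannot yield both comparisons; predicting weight requires volume and specific gravity together.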

The importance of cascade correlation 
models, considered earlier, is that they offer 
a possible mechanism by which extra dimen- 
sions can be added to cognitive representa- 
tions to take account of variations in the task. 
The model does not have to be told what di- 
mensions to include. It creates dimensions 
in its own representations, contained in the 
hidden units, as required for input-output 
functions on which it is trained. This can be 
seen as modeling the increased dimension- 
ality of children’s cognitive representations 
as they learn to predict variations in the en- 
vironment. This mechanism illustrates the 
potential for neural nets to provide the long- 
hoped-for basis of constructivism without 
postulating that all the dimensions children 
attend to are innately determined (Elman 
et al., 1996, but see also Marcus, 1998a, 
1998b). Mareschal & Shultz (1996) suggest 
that cascade correlation models can provide 
a way to increase the computational power 
of a system, thereby overcoming a criticism 
by Fodor (1980) of constructivist models of 
cognitive development. 


Knowledge and Expertise 


The theories considered so far have placed 
major emphasis on development of reason- 
ing processes, but acquisition and organiza- 
tion of knowledge is equally important. Fur- 
thermore, knowledge acquisition interacts 
with development of reasoning processes to 
determine how effectively children can rea- 
son and solve problems. 

Several important lines of research have 
recognized acquisition of knowledge as 
a major factor in cognitive development 


Keil, 1991). Cognitive development can also 
be seen as analogous to acquisition of ex- 
pertise, so the reasoning of young children 
is analogous to that of the novice in a do- 
main. The effect of domain expertise on 
even the most basic cognitive functions was 
demonstrated by Chi (1976), who showed 
that child chess experts outperformed adult 
chess novices on a simple recall test of chess 
pieces on a board. On recall of digits, the 
children performed according to age norms, 
and well below the level of adults. This 
experiment cannot be interpreted validly 
as showing that memory capacity does not 
change with age, because capacity is not 
measured and the experiment is quite con- 
sistent with an increase in capacity with age 
that is overridden by differences in domain 
expertise. The capacity question requires 
quite a different methodology. However, the 
study does show how powerful effects of 
domain knowledge can be. Carey (1991) ar- 
gues that differentiation of heat and mass 
by young children is similarly attributable to 
knowledge acquisition. There is, of course, 
no logical reason to assume that explanations 
based on knowledge acquisition are neces- 
sarily incompatible with explanations based 
on growth in capacity. Most of the evidence 
suggests an interaction of these processes. 

Although not in the mainstream of 
knowledge research in cognitive develop- 
ment, Halford and Wilson (1980) and Hal- 
ford, Bain et al. (1998) investigated possible 
mechanisms for acquisition of structured 
knowledge along lines similar to the in- 
duction theory proposed by Holland et al. 
(1986). See also a special issue of Human 
Development (Kuhn, 1995) on reconceptual- 
izing the intersection between development 
and learning. 

Advances in our understanding of chil- 
dren’s knowledge have had a pervasive influ- 
ence on research in the field, and it would be 
hard to think of a domain that has not been 
touched by it. In this review, knowledge is 
considered in relation to children’s exper- 
tise in specific domains, including conser- 
vation, transitivity, classification, prototype 
formation, theory of mind, and scientific and 
Domain Specificity versus Generality


The view that cognitive processes are 
domain-specific rather than domain-general 
has developed in parallel with knowledge 
acquisition theories of cognitive develop- 
ment and has been reinforced by Fodor’s 
proposal (1983) that many cognitive pro- 
cesses are performed by specialized mod- 
ules. For example, it has been proposed 
that conditional reasoning (i.e., reasoning 
in which the major premise has the form 
“if-then”) might depend on a module for
cheater detection (Cheng & Holyoak, 1989; 
Cosmides & Tooby, 1992; but see Cosmides 
& Tooby, 1989), that understanding mathe- 
matics might depend on innate enumeration 
processes (Gelman, 1991), or that reasoning 
about cause might be facilitated by a mod- 
ule for processing causal information (Leslie 
& Keeble, 1987). One achievement has been 
to show that young children understand 
the distinction between artifacts and natu- 
ral kinds (Keil, 1991) and have considerable 
knowledge of basic facts about the world. 
For example, they understand that animals 
move autonomously, have blood, and can 
die (Gelman, 1990; Keil, 1995). The dis- 
tinction between animate and inanimate ob- 
jects even seems to be appreciated in infancy 
(Gergely et al., 1995). One result of these 
developments has been an increasing bio- 
logical perspective in theories of children’s 
reasoning (Kenrick, 2001). Domain-specific 
knowledge must now be seen as having a ma- 
jor influence on the developing cognitions of 
children, but it does not displace domain- 
general knowledge entirely. Basic cognitive 
operations such as memory retrieval, and 
basic reasoning mechanisms such as anal- 
ogy and means-end analysis, are applicable 
across domains. Furthermore, some higher 
reasoning processes such as transitive infer- 
ence and classification are found to corre- 
spond across domains (Andrews & Halford, 
2002). Theories such as that of Case (1985; 
Case et al., 1996) recognize the importance 


processes. 


Reasoning Processes 


Piaget based his theory of cognitive devel- 
opment on the child’s progression through 
increasingly complex logics, but this ap- 
proach has not been generally successful 
as a way of modeling children’s reasoning 
(Halford, 1993; Osherson, 1974). Consider- 
able success has been achieved in account- 
ing for adult reasoning using mental models 
(Johnson-Laird & Byrne, 1991), analogies 
(Gentner & Markman, 1997; Hofstadter, 
2001; Holyoak & Hummel, 2001; Holyoak 
& Thagard, 1995), schemas (Cheng et al., 
1986), and heuristics (Kahneman, Slovic, & 
Tversky, 1982). 

Analogical reasoning is reviewed by 
Holyoak (Chap. 6), but the implications for 
understanding cognitive development are 
considered here. An analogy is a structure- 
preserving map from a base or source to 
a target (Gentner, 1983; Holyoak & Tha- 
gard, 1989). The map is validated by struc- 
tural correspondence rather than similar 
elements. Structural correspondence is defined
by two principles: uniqueness of mapping
implies that an element in the base
is mapped to one and only one element 
in the target; symbol-argument consistency 
implies that if a relation symbol r in one 
structure is mapped to the relation symbol 
r’ in the other structure, the arguments 
of r are mapped to the arguments of r’ 
and vice versa. These principles operate as 
soft constraints and can be violated in small 
parts of the mapping if the overall mapping 
conforms to the criteria. Success in map- 
ping depends on representation of the corre- 
sponding relations in the two structures and 
on ability to retrieve the relevant representa- 
tions, which, in turn, depends on knowledge 
of the domain. Research on children’s ana- 
logical reasoning is reviewed by Goswami 
(1998, 2002). 
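
The two principles can be expressed as simple checks on a candidate mapping. In the sketch below, facts are tuples of a relation symbol followed by its arguments, and a mapping is a dictionary from base symbols to target symbols. This representation, the function names, and the example facts are assumptions made for illustration. The checks are written as hard tests, whereas the principles described above operate as soft constraints.

```python
def is_unique(mapping):
    """Uniqueness: no two base elements map to the same target element.

    A dict already gives each base element exactly one target; this
    check adds the one-to-one requirement in the other direction.
    """
    targets = list(mapping.values())
    return len(targets) == len(set(targets))


def is_consistent(base_facts, target_facts, mapping):
    """Symbol-argument consistency.

    If a base relation is mapped to a target relation, its arguments
    must map to that relation's arguments in the same positions.
    """
    for relation, *args in base_facts:
        if relation not in mapping:
            continue
        mapped = (mapping[relation], *[mapping.get(a) for a in args])
        if mapped not in target_facts:
            return False
    return True


# Illustrative facts for a simple melting analogy.
base = {("melts_into", "chocolate", "melted chocolate")}
target = {("melts_into", "snowman", "melted snowman")}

good = {
    "melts_into": "melts_into",
    "chocolate": "snowman",
    "melted chocolate": "melted snowman",
}
# Same relation mapping, but the arguments are crossed.
bad = {
    "melts_into": "melts_into",
    "chocolate": "melted snowman",
    "melted chocolate": "snowman",
}
```

Both mappings are one-to-one, but only `good` preserves argument positions, so only it counts as a structurally valid analogy under these checks.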

Numerous studies have assessed young 
children’s ability to perform simple propor- 
tional analogy — that is, problems of the 
form A is to B as C is to D. Brown (1989) 
could use analogies for both learning and
problem solving if they understood the rel- 
evant relations, were able to retrieve them 
from memory, and understood the aims of 
the task. This was borne out by Goswami 
(1989), who showed that three-, four-, and 
six-year-old children could perform analogies
based on relations they understood, such
as cutting or melting (e.g., chocolate:melted 
chocolate::snowman:melted snowman). In 
a less tightly structured context, Gentner 
(1977) showed that four- to six-year-old 
children could map human body parts to 
inanimate objects such as trees (e.g., if a tree
had a knee it would be on the trunk a short 
distance above the ground). There appears 
to be consensus now that young children 
can perform analogies with simple relations 
if they have the relevant domain knowledge 
and if the test format is appropriate to the 
age of the children. 

Young children can also use analogies for 
problem solving (Brown, Kane, & Echols, 
1986; Crisafi & Brown, 1986; Holyoak, Junn, 
& Billman, 1984). In the study by Holyoak 
et al. (1984), children were told a story about
a genie who transferred jewels from one 
bottle to another by rolling his magic car- 
pet into a tube and rolling the jewels down 
it. Then they were given the problem of 
transferring gumballs from one jar to an- 
other using a tube made by rolling a sheet of 
heavy paper. Even four-year-olds showed ev- 
idence of analogical reasoning. Gholson et al. 
(1996) tested children from first to fifth 
grade on transfer from missionaries and can- 
nibals problems to jealous husbands prob- 
lems, both of which require a sequence of 
moves to be selected for transferring peo- 
ple from one place to another without vi- 
olating constraints. In a second experiment 
they used similar problems that required a 
sequence of arithmetic steps to be chosen. 
The children showed evidence of analogi- 
cal transfer, based on representation of com- 
mon relations. Pauen and Wilkening (1997) 
found evidence that second- and fourth- 
grade children transferred selected aspects 
of balance scale problems to simple physi- 
cal force problems. 


tive for providing explanations of human 
reasoning (Johnson-Laird & Byrne, 1991; 
Johnson-Laird, Chap. 9). A mental model 
is more content-specific than a logical rule 
and is used by analogy. Gentner and Gen- 
tner (1983) showed that high school and 
college students could use water flowing 
in pipes as mental models of electricity, 
so pipes were mapped to conductors, con- 
strictions in pipes were mapped to resis- 
tors, water pressure to voltage, water flow 
to electric current, and reservoirs to bat- 
teries. Furthermore, reservoirs placed above 
one another were mapped into batteries in 
series, and the increase in water pressure 
was mapped to the increase in voltage, and 
so on. 

It appears that mental models are also an 
effective way of accounting for development 
of reasoning in children (Barrouillet, Gros- 
set, & Lecas, 2000; Barrouillet & Lecas, 1999; 
Halford, 1993; Markovits & Barrouillet, 
2002). Markovits and Barrouillet (2002)
have developed a mental models theory that 
accounts for most of the data on children’s 
conditional (if-then) reasoning. Condition- 
als may refer to classes (e.g., if X is a dog 
then X is an animal, or simply, all dogs are 
animals) or to causal relations (e.g., if it rains, 
the ground will get wet). For the problem, 
if p then q, p therefore q (modus ponens), 
construction of a mental model begins with 
the following representation: 


p q 


This represents the case in which p and 
q are both true. The model could now be 
fleshed out with other possibilities as follows 
(where ¬p is read as “not p”):


 p  q
¬p ¬q
¬p  q


The second premise “p” is processed by 
selecting those components of the model 
where p is true, in this case, the first line; 
then inference is made by examining these 
cases. In this model, in the only case in which 
p is true, q is also true, so the inference is “q.” For example, if the major premise were
“If an animal is a dog then it has legs,” then
the initial model would be 


dog legs 


This could be fleshed out with alternative 
cases such as 


not dog no legs 
not dog legs 


Thus the premises are processed as rela- 
tional propositions, referring to specific in- 
stances, and are fleshed out by retrieving 
relevant information from semantic mem- 
ory. The accuracy of children’s reasoning de- 
pends on the fleshing out process, which is 
influenced by availability of relevant infor- 
mation in memory and by working memory 
capacity. The minor premise “not dog” can 
produce the fallacious inference “no legs” 
(denial of the antecedent) if the second line 
of the mental model is missing. This could 
occur if the child failed to retrieve any cases 
of things that are not dogs but have legs. 
Similarly, the minor premise “legs” can pro- 
duce the fallacious inference “dog” (affirma- 
tion of the consequent). Markovits (2000) 
has shown that children are more likely to 
recognize that these inferences are not justi- 
fied if they can readily generate the alterna- 
tive cases. In the aforementioned example, it 
is easy to generate instances of things that are 
not dogs but have legs. In a problem such as 
“If something is a cactus then it has thorns,”
generation of alternative cases is more diffi- 
cult, and children are less likely to recognize 
the fallacies. The second major factor is pro- 
cessing (working memory) capacity. More 
complex problems entail representation of 
more relations. The example just given ef- 
fectively entails three relations correspond- 
ing to the three lines of the mental model. 
A simpler problem would consist of only 
the first and second lines and corresponds 
to a biconditional interpretation of the ma- 
jor premise. This representation is simpler. 
Increase in effective capacity with age en- 
ables children to reason correctly on more 
complex problems. The model of Markovits


content effects (Barrouillet & Lecas, 1998; 
Leevers & Harris, 1999) and complexity ef- 
fects (Halford, Wilson, & Phillips, 1998). 
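
The mental-model account of conditional inference described above can be sketched in a few lines. This is an illustrative encoding only, not the authors' implementation: each model is a dictionary of truth values for p and q, the function name infer is hypothetical, and an inference is drawn only when every model selected by the minor premise agrees on the other term.

```python
def infer(models, minor_premise):
    """Select the models in which the minor premise holds and return the
    value of the other term, or None if the selected models disagree
    (or if no model matches)."""
    term, value = minor_premise
    other = "q" if term == "p" else "p"
    outcomes = {m[other] for m in models if m[term] == value}
    return (other, outcomes.pop()) if len(outcomes) == 1 else None


# Initial model: p and q both true.
INITIAL = [{"p": True, "q": True}]

# Fully fleshed-out conditional: p q / not-p not-q / not-p q.
FLESHED_OUT = INITIAL + [{"p": False, "q": False}, {"p": False, "q": True}]

# Biconditional reading: the not-p q case was never retrieved.
BICONDITIONAL = FLESHED_OUT[:2]
```

With the fully fleshed-out model, the minor premise p yields q (modus ponens), and the minor premises not-p and q yield no conclusion. With the two-line biconditional model, the same minor premises yield not-q and p, which are the denial of the antecedent and affirmation of the consequent fallacies.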

Other studies of children’s conditional 
reasoning have utilized the Wason selection 
task (Wason, 1966; see also Evans, Chap. 8). 
The task entails four cards containing (say) 
an A, B, 4, and 7, and participants are told 
that there is a letter on one side of each 
card and a number on the other. They are 
asked which cards must be turned over to 
test the proposition that if there is an A on 
one side there must be a 4 on the other. 
The correct answer, cards containing the A 
and 7, is rare even among adults. There are 
well-known content effects, and it has been 
shown that versions of the task based on per- 
mission (Cheng et al., 1986) or cheater de- 
tection (Cosmides & Tooby, 1992) are per- 
formed better. Similar improvements have 
been observed in children (Cummins, 1996; 
Light et al., 1989). 
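
The logic of the correct answer can be made explicit with a short sketch. It assumes the abstract version of the task described above, with single-character card faces; the function name cards_to_turn is hypothetical. A card needs to be turned over only if its hidden side could falsify the rule.

```python
def cards_to_turn(visible_faces, antecedent="A", consequent="4"):
    """Return the cards whose hidden side could falsify
    'if antecedent on one side, then consequent on the other'."""
    chosen = []
    for face in visible_faces:
        if face.isalpha():
            # A letter card can falsify the rule only if it shows the antecedent
            # (its hidden number might not be the consequent).
            if face == antecedent:
                chosen.append(face)
        else:
            # A number card can falsify the rule only if it is NOT the consequent
            # (its hidden letter might be the antecedent).
            if face != consequent:
                chosen.append(face)
    return chosen


selection = cards_to_turn(["A", "B", "4", "7"])
```

Here selection contains the A and the 7. The 4 card is not selected because the rule says nothing about what must accompany a 4.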

The literature supports the claim that 
conditional reasoning is possible for chil- 
dren, even as young as four, and improve- 
ments can be produced by more appro- 
priate task presentation (Markovits et al., 
1996) and by experience, but considerable
development occurs throughout childhood
(Muller, Overton, & Reene, 2001).
Among the relatively late-developing com- 
petences are understanding of logical ne- 
cessity (Falmagne, Mawby, & Pea, 1989; 
Kuhn, 1977; Morris & Sloutsky, 2002; Os- 
herson & Markman, 1975) and reasoning 
that requires representation of complex 
relations. 


Elementary Concepts 


Conservation, transitivity, and classification 
are three concepts that have long been con- 
sidered fundamental to children’s reasoning. 
All have been controversial because Piaget's 
claim that they are concrete operational and 
unattainable before seven to eight years has 
been contested by many researchers. We 
briefly consider each in turn. 
Perhaps the most widely researched concept
in the field, conservation, is still not well 
understood. The Q-SOAR model of conser- 
vation acquisition was briefly reviewed ear- 
lier, and other models exist (Caroff, 2002; 
Halford, 1970; Shultz, 1998; Siegler, 1995). 
However, the issues that have received most 
attention in the literature concern how con- 
servation should be measured and the age 
at which children master it. A number 
of authors have argued that the Piagetian 
tests misled children and therefore underestimated
their understanding. A common
cause of the alleged misunderstanding is 
that an increase in the length of a row of 
objects (or an increase in the height of a 
column of liquid) makes the number (or 
amount) appear greater. This received sub- 
stantial support from a study by Gelman 
(1969), who used an oddity training pro- 
cedure to induce five-year-old children to 
attend to number rather than length and 
showed that they conserved number. This 
interpretation received further support from 
McGarrigle and Donaldson (1975), who im- 
proved conservation in children aged four 
to six years by having a “naughty teddy” 
perform the transformation, thereby mak- 
ing it accidental and removing any sugges- 
tion that an increase in amount was in- 
tended. These studies, like a host of others, 
showed improved performance in children 
about five to six years of age. However, 
Bryant (1972) eliminated length cues and 
showed that three- and four-year-old chil- 
dren carried a pretransformation judgment 
over into the posttransformation situation. 
However, his claim that this demonstrated 
conservation was disputed by Halford and 
Boyle (1985). Sophian (1995) also failed to
replicate Bryant’s finding of early conserva- 
tion and showed that conservation was re- 
lated to understanding of counting, suggest- 
ing that conservation reflects some aspects 
of children’s quantitative concepts. From ex- 
tensive reviews of the conservation literature 
(Halford, 1982, 1989), it seems there is clear 
evidence of conservation at approximately 
five years of age, which is earlier than Piaget 


Bryant (1972). 
Transitivity and Serial Order 


A transitive inference has the form: aRb, 
bRc implies aRc, if R is a transitive rela- 
tion. For example, a > b, b > c implies a > 
c. Piaget’s claim of late attainment was chal- 
lenged by Bryant and Trabasso (1971), who 
trained three- to six-year-old children to 
remember the relative lengths of adjacent 
sticks in a series (e.g., a < b, b < c, c < d,
d < e). Then they were tested on all possible
pairs. The crucial pair is bd because this
was not learned during training and must be 
inferred from b < c, c < d. Also, the bd pair 
avoids the end elements, which tend to be 
labeled as small (a) or large (e). Bryant and 
Trabasso found that three- and four-year- 
old children performed above chance on the 
bd pair, suggesting that they made a tran- 
sitive inference. Riley and Trabasso (1974) 
showed that both children and adults per- 
formed the task by ordering the elements — 
that is, they formed the ordered set a,b,c,d,e. 
This in itself does not affect the validity 
of the test because an asymmetric, transi- 
tive binary relation is a defining property 
of an ordered set, so the children presum- 
ably utilized transitivity in some way while 
ordering the elements. The problem, how- 
ever, was that, to facilitate acquisition, the 
premise pairs were presented initially in as- 
cending or descending order (a < b, b < c, 
c < d, d < e, or the reverse). This clearly
gave children undue help in ordering the 
elements. Furthermore, children who failed 
to learn the premise pairs were eliminated, 
and elimination rates were as high as 50% in 
some experiments. The problem with this 
is that children might have failed to learn 
the premises because they could not deter- 
mine the correct order, which, in turn, might 
reflect lack of understanding of transitivity. 
When Kallio (1982) and Halford and Kelly 
(1984) eliminated these extraneous sources 
of help, success was not observed below five 
years of age. Subsequent research (Andrews 
& Halford, 1998; Pears & Bryant, 1990) 
has confirmed that transitive inference is 
olds, and the median age of attainment is
about five years. For more extensive reviews 
and theoretical discussions, see Brainerd & 
Reyna (1993), Breslow (1981), Thayer & 
Collyer (1978), and Halford (1982, 1993). 
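
The ordering strategy reported by Riley and Trabasso (1974) can be sketched as follows. This is an illustrative reconstruction under simplifying assumptions: the premises form a single chain of adjacent pairs with no cycles, each pair is written (smaller, larger), and the function names order_elements and is_less are hypothetical.

```python
def order_elements(premises):
    """Build the ordered set from adjacent (smaller, larger) premise pairs."""
    smaller = {x for x, _ in premises}
    larger = {y for _, y in premises}
    successor = dict(premises)        # each element has at most one successor
    start = (smaller - larger).pop()  # the element that is never the larger one
    ordered = [start]
    while ordered[-1] in successor:
        ordered.append(successor[ordered[-1]])
    return ordered


def is_less(ordered, x, y):
    """Answer any pair, trained or untrained, by reading off positions."""
    return ordered.index(x) < ordered.index(y)


# Premises presented in scrambled order, so the order is not given away.
premises = [("c", "d"), ("a", "b"), ("d", "e"), ("b", "c")]
series = order_elements(premises)
```

Once the series has been constructed, the untrained bd pair is answered by position alone. This is why presenting the premises in ascending or descending order gives extraneous help: it supplies the ordering that the participant is supposed to construct.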

A derivative of the Bryant and Trabasso 
(1971) paradigm, transitivity of choice, has 
found wide use in animal studies (Boysen 
et al., 1993; Chalmers & McGonigle, 1984; 
von Fersen et al., 1991; McGonigle & 
Chalmers, 1977; Terrace & McGonigle, 
1994). Participants are trained to choose one 
member of each pair in a series. For example, 
they are rewarded for choosing A in prefer- 
ence to B, B in preference to C, C in prefer- 
ence to D, and D in preference to E. Tran- 
sitivity of choice is indicated by choice of B 
in preference to D. However, whereas tran- 
sitive inference implies an ordinal scale of 
premise elements, in transitivity of choice 
there is no such scale (Markovits & Du- 
mas, 1992). Furthermore, whereas the transi- 
tive inference task is performed dynamically 
in working memory, following a single pre- 
sentation of premises, the premise pairs in 
transitivity of choice are learned incremen- 
tally over many trials, and the task can be 
performed by associative processes (Wynne, 
1995). Although both paradigms are impor- 
tant, transitive inference and transitivity of 
choice should not be regarded as equivalent 
tests of the transitivity concept. 


Classification 


Concepts and categories are reviewed by 
Medin and Rips (Chap. 3). Developmen- 
tally, categorization appears to progress from 
prototypes, arguably the most basic form of 
categorization, to more advanced categories, 
including those based on rules or theories. 
All advanced categories appear to have a la- 
bel or symbol (e.g., “dog” for the dog cate- 
gory) so it will be convenient to deal with 
them under the heading of symbolic cate- 
gories. 


PROTOTYPE MODELS OF CLASSIFICATION 


There is evidence that infants can form pro- 
totypic categories (Rosch & Mervis, 1975; 


a set of objects with similar features, such as 
animals (dogs, cats), form categories (Quinn, 
2002). They are also sensitive to the correla- 
tion between attributes (Younger & Fearing, 
1999). However, prototypes are arguably 
subsymbolic because they are well simulated 
by three-layered nets (Quinn & Johnson, 
1997) and do not have properties such as 
compositionality that are basic to symbolic 
processes (Fodor, 1995; Halford, Phillips, & 
Wilson, unpublished manuscript). Mandler 
(2000) argues that infants make the transi- 
tion from perceptual categories, which en- 
able objects to be recognized by their ap- 
pearance, to conceptual categories, defined 
by the role objects play in events and that 
serve as a basis for inductive inference. 

Neural nets are a very suitable basis for 
constructing models of prototype forma- 
tion, and McClelland and Rumelhart (1985) 
produced an early prototype model that fun- 
damentally changed the way we view cate- 
gorization. Quinn and Johnson (1997) devel- 
oped a three-layered net model of prototype 
formation in infants. There were thirteen in- 
put nodes that encoded attributes of picto- 
rial instances of four animals (cats, dogs, ele- 
phants, rabbits) and four kinds of furniture 
(beds, chairs, dressers, tables). There were 
three hidden units and ten output units, two 
of which coded for category (animals, furni- 
ture) and the remainder coded for the eight 
instances. After the net was trained to rec- 
ognize categories and instances, the repre- 
sentations in the hidden units were exam- 
ined. Initially there was no differentiation, 
then the units differentiated mammals and 
furniture, then instances were distinguished 
within the categories. The study is important 
for showing how categories can be formed 
by a learning algorithm. As with the mod- 
els of McClelland (1995) and Elman (1990) 
discussed earlier, the representations emerge 
from the learning process. 


SYMBOLIC CATEGORIES 


Even young children form categories based 
on, and draw inductive inferences about, es- 
sential or nonobvious properties (Gelman, 
rized on the basis of hidden properties that
cause their surface or observable features, so 
animals contain essential biological material 
that enables them to move, eat, make characteristic
sounds, and reproduce animals of the
same kind. Categorization by essential prop- 
erties is sometimes interpreted as evidence 
that people have theories about the domain, 
such as theories about the nature of animals, 
although there are also interpretations based 
on causal laws rather than essences (Re- 
hder & Hastie, 2001; Strevens, 2000). There 
is strong evidence that young children can 
make inductive inferences based on category 
membership. Gelman and Markman (1986) 
presented four-year-olds with a picture of a 
bird, told the children a property of the bird 
(feeds its young with mashed up food) and 
found that the children attributed the prop- 
erty to other, even dissimilar, birds but not 
to a different category such as bats. Young 
children appear to generalize even nonob- 
servable properties on the basis of category 
membership, independent of appearance. 
The ease with which young children 
make inductive inferences about categories 
contrasts with the difficulty they have in rea- 
soning about hierarchically structured cate- 
gories. For example, given twelve apples and 
three oranges, when asked whether there are 
more apples or more fruit, they tend to say 
there are more apples. This task is a deriva- 
tive of class inclusion items originally used by 
Inhelder & Piaget (1964). The Piagetian hy- 
pothesis was that children lacked a concept 
of inclusion till they reached the concrete 
operations stage, but many alternative hy- 
potheses have been proposed (McGarrigle, 
Grieve, & Hughes, 1978; Siegel et al., 1978; 
Winer, 1980). Misinterpretation of the ques- 
tion is the common feature in these pro- 
posals. That is, children interpret “more ap- 
ples or more fruit” to mean “more apples 
or more other kinds of fruit,” and because 
there are only three pieces of non-apple fruit 
(oranges), they say there are more apples. 
On the other hand Halford (1989) argued 
that many of the improved performances 
produced by alternative tests were no bet- 
ter than chance, or were amenable to al- 


niques for estimating the number of answers 
attributable to misinterpretation or guess- 
ing have been developed (Hodkin, 1987; 
Thomas & Horton, 1997). 

Alternative assessments have been de- 
vised, one based on a sorting task that was 
isomorphic to class inclusion but did not include
potentially misleading questions (Halford
& Leitch, 1989) and one based on
property inference (Greene, 1994; Johnson, 
Scott, & Mervis, 1997). Understanding class 
inclusion entails recognition of the asym- 
metric relation between categories at dif- 
ferent levels of the hierarchy. For example, 
properties of fruit apply to apples, but the re- 
verse is not necessarily true, because apples 
may have properties not shared with other 
fruit. Halford, Andrews, & Jensen (2002) 
assessed category induction and class in- 
clusion by equivalent methods, based on 
property inference. Relational complexity 
analysis showed that category induction is 
binary relational, because it entails a com- 
parison of a class with its complement 
(e.g., birds and non-birds). Class inclusion is 
ternary relational because it necessarily en- 
tails an inclusive class (e.g., fruit), a sub- 
class (e.g., apples), and a complementary 
subclass (non-apple fruit). When assessed 
by equivalent methods, class inclusion was 
found to be more difficult and performance 
on it was predicted by ternary relational 
tasks from other domains. This suggests that 
category induction and class inclusion are 
really the same paradigm at two levels of 
complexity. 
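
The three terms involved in class inclusion, and the misinterpretation discussed earlier, can be shown with the apples-and-oranges item. The sketch below is illustrative only; the helper name more is hypothetical.

```python
def more(label_a, count_a, label_b, count_b):
    """Return the label of the larger set, or 'same' if the counts are equal."""
    if count_a == count_b:
        return "same"
    return label_a if count_a > count_b else label_b


fruit = ["apple"] * 12 + ["orange"] * 3    # inclusive class
apples = fruit.count("apple")              # subclass
non_apple_fruit = len(fruit) - apples      # complementary subclass

# Correct reading: compare the subclass with the inclusive class.
correct = more("apples", apples, "fruit", len(fruit))

# Misreading: compare the subclass with its complement.
misread = more("apples", apples, "other fruit", non_apple_fruit)
```

The correct comparison returns "fruit" (15 versus 12), whereas the misread comparison returns "apples" (12 versus 3). Answering correctly requires keeping the inclusive class, the subclass, and the complement distinct at the same time, which is the sense in which the task is ternary.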

Conservation, transitivity, and class inclu- 
sion are all ternary relational (Andrews & 
Halford, 1998; Andrews & Halford, 2002; 
Halford, Wilson, & Phillips, 1998) and this 
level of complexity is attainable by approximately
20% of four-year-olds, 50% of five-year-olds,
70% of six-year-olds, and 78% of seven- and eight-year-olds.
There is no age at which children sud- 
denly attain all these concepts, as implied 
by some interpretations of Piagetian stage 
theory. Rather, the proportion of children 
who succeed increases according to a bio- 
logical growth function. We can conclude 
years.


Concept of Mind 


Children’s ability to understand other peo- 
ple’s mental states has been one of the 
most intensively researched topics in the 
past two decades. See Astington (1993) or 
Wellman, Cross, and Watson (2001) for re- 
views. Two main types of tasks have been 
employed — appearance-reality and false be- 
lief. Appearance-reality is tested by present- 
ing children with an object that appears to be 
something else and asking them what it re- 
ally is and what it appears to be. For example, 
Flavell, Green, and Flavell (1986) showed 
children a small white fish and then covered 
it with a blue filter and asked what color it 
was really and what color it appeared to their 
eyes. Children below about four years have 
difficulty recognizing both that the object is 
really white and that it appears blue. In a 
typical false-belief task (Wimmer & Perner, 
1983), Person 1 hides an object in a box and 
leaves the room, then Person 2 shifts the ob- 
ject to a basket, and then Person 1 returns. 
Before age four, children have difficulty rec- 
ognizing that Person 1 will look for the object 
in the box because he or she did not see it 
moved to the basket. 

Numerous factors have been shown to 
influence children’s concept of mind, in- 
cluding social-perceptual knowledge (Tager- 
Flusberg & Sullivan, 2000), understanding 
of mental states (Bretherton & Beeghly, 
1982), and language (Astington & Jenkins, 
1999). Astington & Gopnik (1991) proposed 
a theory-theory, meaning that children’s 
concepts of belief, desire, and pretence 
are linked in an explanatory framework. 
The more neutral term “concept of mind” 
is used here, even though “theory of mind” 
is in common use, because there are still 
doubts that children’s understanding of the 
mind amounts to a theory. For example, 
telling children that people’s thoughts can be 
wrong or reminding them of their own false 
beliefs does not raise three-year-olds’ per- 
formance above chance. Leslie (1987) pro- 
posed an innate theory of mind mechanism, 


social cues indicating mood, interest, or at- 
tention. 

There is growing evidence that concept of 
mind is related to executive function (Carl- 
son & Moses, 2001; Perner, Lang, & Kloo, 
2002) and is partly a function of ability 
to deal with the appropriate level of com- 
plexity. Halford (1993; Halford, Wilson, & 
Phillips, 1998) analyzed the complexity of 
concept of mind tasks and showed it en- 
tails integrating three variables — the envi- 
ronmental cue, the setting condition, and the 
person’s representation. Appearance-reality 
requires processing the relation between ob- 
ject color (white), the color of the filter 
(blue), and the percept (white or blue). Evi- 
dence that complexity is a factor in concept 
of mind has been produced by several groups 
of researchers (Andrews et al., 2003; Davis & 
Pratt, 1995; Frye, Zelazo, & Palfai, 1995; Gor- 
don & Olson, 1998; Halford, 1993; Keenan, 
Olson, & Marini, 1998). 

The analysis showing that concept of 
mind requires processing ternary relations 
suggests this should not be possible for chim- 
panzees because the most complex relation 
they have been shown to process is binary 
(Halford, Wilson, & Phillips, 1998). Al- 
though the issue has been controversial, a 
well-controlled study by Call and Tomasello 
(1999) tends to support this prediction. (See 
also Tomasello & Call, Chap. 25.) 


Scientific Thinking 


The topics reviewed so far on children’s un- 
derstanding of conservation, transitive infer- 
ence, serial order, classification, cause, and 
biological processes are all important to the 
development of scientific and mathemati- 
cal thinking. In this section, we consider 
some of the more advanced forms of sci- 
entific and mathematical reasoning in chil- 
dren. (See also Chap. 5 on causal reasoning 
by Buehner & Cheng, Chap. 23 on mathe- 
matical thinking by Gallistel & Gelman, and 
Chap. 29 on reasoning in science by Dunbar 
& Fugelsang.) 

Whether children think as young scien- 
tists has been a major question of interest 
rization and concept of mind are driven
by theories of the domain, as noted earlier. 
Kuhn et al. (1988) investigated how children 
assessed evidence in order to test hypothe- 
ses, and concluded that there was a con- 
siderable lack of scientific objectivity, espe- 
cially among the younger children. Similarly, 
Klahr, Fay, and Dunbar (1993) found strong 
developmental effects in a study of ability to 
design experiments to determine rules un- 
derlying operation of a robot. On the other 
hand, Ruffman et al. (1993) found evidence 
that six-year-olds have some understanding 
of how covariation evidence has implications 
for hypotheses about factors responsible for 
an event. There is also evidence that chil- 
dren as young as five recognize the eviden- 
tial diversity principle — that we can be more 
confident of an induction from a set of diverse
premises than from a set of similar premises
(Heit & Hahn, 2001; Lo et al., 2002; Sloman,
Chap. 3, this volume). A theoretical account 
of the development of inductive reasoning is 
given by Kuhn (2001), and a review of the 
development of scientific reasoning skills is 
provided by Zimmerman (2000). 


Time, Speed, Distance, and Area 


Understanding time, speed, and distance is 
interesting because it entails relations among 
three variables: speed = distance × time⁻¹;
and this relation should be accessible from
other directions, so distance = speed × time,
and so on. Matsuda (2001) found a pro- 
gression from considering relations between 
two variables (e.g., between duration and 
distance or between distance and speed) 
at four years to integration of all three di- 
mensions by age 11. Wilkening (1980) used 
an information integration theory approach 
in which the variance in children’s judg- 
ments of distance was assessed as a function 
of speed and duration. In the information 
integration approach, reliance on a factor 
is indicated by a main effect, and reliance 
on the product of speed and duration is in- 
dicated by an interaction of these factors. 
Integration by an additive rule, speed + 
distance, is indicated by two main effects. 


showed evidence of the multiplicative rule. 
A similar assessment of children’s understanding
of area indicated gradual progression
from an additive rule (area = length +
width) to a multiplicative rule (area =
length × width) by adulthood. A cascade
correlation model of time, distance, and ve- 
locity judgments is provided by Buckingham 
and Shultz (2000). 
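
The diagnostic logic of the information integration approach can be illustrated with a small sketch. It assumes an idealized, noise-free 2 × 2 factorial design in which judged distance depends on speed and duration; the function name interaction_contrast and the factor levels are hypothetical. An additive rule produces main effects but a zero interaction contrast, whereas a multiplicative rule produces a nonzero interaction.

```python
def interaction_contrast(rule, levels_a, levels_b):
    """2 x 2 interaction contrast:
    rule(hi, hi) - rule(hi, lo) - rule(lo, hi) + rule(lo, lo)."""
    a_lo, a_hi = levels_a
    b_lo, b_hi = levels_b
    return (rule(a_hi, b_hi) - rule(a_hi, b_lo)
            - rule(a_lo, b_hi) + rule(a_lo, b_lo))


def additive(speed, duration):
    return speed + duration


def multiplicative(speed, duration):
    return speed * duration


speeds = (1, 3)      # low and high speed (arbitrary units)
durations = (2, 5)   # low and high duration (arbitrary units)

additive_interaction = interaction_contrast(additive, speeds, durations)
multiplicative_interaction = interaction_contrast(multiplicative, speeds, durations)
```

For the multiplicative rule the contrast equals the product of the two level differences, here (3 − 1) × (5 − 2) = 6, and for the additive rule it is 0. In real data the contrast would be tested statistically rather than compared with zero exactly.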


Causal Reasoning 


Infants are able to perceive causal links be- 
tween entities (Leslie & Keeble, 1987), but 
the causal reasoning of older children seems 
to be influenced by complexity (Brooks, 
Hanauer, & Frye, 2001; Frye et al., 1996) 
or by concept availability (Ackerman, Silver, 
& Glickman, 1990). The explanation may 
be that, as Leslie and Keeble (1987) sug- 
gest, the causal recognitions of infants are 
based on a modular process that is essen- 
tially perceptual. Modular processes are not 
typically influenced by cognitive complex- 
ity. The causal reasoning of older children 
probably depends on more conceptual or 
symbolic processes (Schlottman, 2001). 


Balance Scale 


Siegler (1981) applied the rule assessment 
approach to children’s performance on the 
balance scale, yielding the four rules that 
were discussed in connection with McClel- 
land’s neural model (1995). Siegler’s data 
showed Rule I (judgments based solely on 
weight) was used by five-year-olds, and 
they could also be taught to use Rule II 
(distance considered if weights are equal). 
Surber and Gzesh (1984) used an informa- 
tion integration approach and found that 
five-year-olds tended to favor the distance 
rule. Case (1985), Marini (1984), and Jansen 
and van der Maas (1997) generally supported 
Siegler’s findings and saw little understand- 
ing of the balance scale before age five. 
Relational complexity theory (Halford, 
1993; Halford, Wilson, & Phillips, 1998) pro- 
poses that discrimination of weights with 
distance constant, or distances with weight 
constant, entails processing binary relations 
This prediction was contrary to previous the-
ory (Case, 1985) and empirical observation 
(Siegler, 1981). Integration of weight and 
distance requires at least ternary relations 
and should emerge along with other ternary 
relational concepts at age five years. These 
predictions were confirmed by Halford 
et al. (2002). 
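
The difference between the early rules and full integration of weight and distance can be sketched as follows. Rule I and Rule II follow the descriptions given above; the torque comparison is an assumption about what full integration amounts to (the product of weight and distance on each side), and all function names are hypothetical.

```python
def side(diff):
    """Translate a left-minus-right difference into a prediction."""
    if diff > 0:
        return "left"
    if diff < 0:
        return "right"
    return "balance"


def rule_1(left_w, left_d, right_w, right_d):
    """Rule I: judge by weight alone."""
    return side(left_w - right_w)


def rule_2(left_w, left_d, right_w, right_d):
    """Rule II: judge by weight; consider distance only if weights are equal."""
    if left_w != right_w:
        return side(left_w - right_w)
    return side(left_d - right_d)


def torque_rule(left_w, left_d, right_w, right_d):
    """Integration of weight and distance: compare weight x distance."""
    return side(left_w * left_d - right_w * right_d)


# Equal weights, unequal distances.
distance_item = (2, 3, 2, 1)
# Conflict item: more weight on the left, more distance on the right.
conflict_item = (3, 1, 2, 3)
```

On the distance item, Rule I predicts balance whereas Rule II and the torque comparison predict that the left side goes down. On the conflict item, Rules I and II both predict left, but the torque comparison predicts right (3 × 1 versus 2 × 3), which is why such items discriminate integration from the simpler rules.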


Concept of the Earth 


The development of children’s concept of 
the earth (Hayes et al., 2003; Samarapunga- 
van, Vosniadou, & Brewer, 1996; Vosniadou 
& Brewer, 1992) has special interest because 
it entails a conflict between the culturally 
transmitted conception that the Earth is a 
sphere and everyday experience that tends 
to make it appear flat. Resolution of this 
conflict entails recognition that the huge circumference
of the earth makes it appear flat
from the surface. There is also a conflict be- 
tween gravity, naively considered as making 
objects fall down, and the notion that people 
can stand anywhere on the Earth’s surface, 
including the southern hemisphere, which 
is conventionally regarded as “down under.” 
This can be resolved by a concept of gravity 
as attraction between two masses, the Earth 
and the body (person) on the surface, but 
there is little basis for this concept in ev- 
eryday life. The development of children’s 
concept of the Earth provides an interest- 
ing study in the integration of complex re- 
lations into a coherent conception. Young 
children were found to attempt resolution of 
the conflicting ideas by, for example, draw- 
ing a circular earth with a horizontal plat- 
form inside for people to stand on, or as a 
flattened sphere to provide more standing 
room at the top. Nevertheless, there was a 
clear tendency for ideas to develop toward 
coherence. 


Conclusions and Future Directions 


Although acknowledging that predictions 
of future developments are inherently 
hazardous, it seems appropriate after an 


to identify some of the more promising de- 
velopments. My bias is to look for develop- 
ments that might provide a coherent body 
of theory because this is what the field of 
cognitive development, like the rest of psy- 
chology, needs most. I have identified four 
trends I feel deserve consideration in this 
respect. 


Neuroscience and the Biological 
Perspective 


The greatly increased knowledge of neuro- 
science and the use of brain imaging as a 
converging operation to help constrain theo- 
ries of cognitive development represent ma- 
jor developments in the past two decades. 
Combined with the biological perspective, 
they do offer some hope of a coherent 
framework for viewing cognitive develop- 
mental data. The identification of changes 
in rates of neural and cognitive develop- 
ment is one example of what this field has 
achieved. 


Dynamic Systems 


Dynamic systems models have made con- 
siderable progress, and they are much more 
clearly linked to data than was the case a 
decade ago. They can provide new perspec- 
tives on important issues such as whether de- 
velopment is continuous or discontinuous or 
the fact that performance might be uneven 
across different indicators of the same task. 
The relative importance of different classes 
of observations might change fundamentally 
with this perspective. 


Transition Mechanisms 


Transition mechanisms and more advanced 
conceptions of learning have provided real 
conceptual advances and some of the most 
important empirical findings in the past 
decade. Neural net models have defined 
some potential mechanisms of concept ac- 
quisition that would almost certainly never 
have been recognized intuitively and would 
have been very difficult to discover using our 
contemporary empirical methods.

Analyses of underlying processes using
methods from general cognitive psychol- 
ogy and cognitive science can help bring 
order and clarity to the field. There are 
many examples of tasks that are superficially 
similar (such as transitive inference and 
transitivity of choice) yet entail fundamen- 
tally different processes, and it only creates 
confusion to categorize them together. Cor- 
respondingly, there are tasks that are super- 
ficially very different, yet may entail un- 
derlying cognitive processes with important 
common properties. An example would be 
the corresponding difficulties of the dimen- 
sional change card sort task, and ternary re- 
lational tasks such as transitivity, class inclu- 
sion, and the concept of mind. We cannot 
order tasks for difficulty, nor discover impor- 
tant equivalences, unless we look beneath 
surface properties. Cognitive psychology has 
progressed to the point at which we can do 
this with reasonable confidence.


CHAPTER 23


Mathematical Cognition 


C. R. Gallistel 
Rochel Gelman 


Mathematics is a system for representing and 
reasoning about quantities, with arithmetic 
as its foundation. Its deep interest for our
understanding of the psychological foundations of
scientific thought comes from what Eugene
Wigner called “the unreasonable effectiveness of
mathematics in the natural sciences.” From
a formalist perspective, arithmetic is a sym- 
bolic game, like tic-tac-toe. Its rules are more 
complicated, but not a great deal more com- 
plicated. Mathematics is the study of the 
properties of this game and of the systems 
that may be constructed on the foundation it 
provides. Why should this symbolic game be 
so powerful and resourceful when it comes 
to building models of the physical world? 
And on what psychological foundations does 
the human mastery of this game rest? 

The first question is metaphysical — why 
is the world the way it is? We do not treat 
it, because it lies beyond the realm of exper- 
imental behavioral science. We review the 
answers to the second question suggested by 
experimental research on human and non- 
human animal cognition. 

The general nature of the answer is that 
the foundations of mathematical cognition 


do not lie in language and the language fac- 
ulty. The ability to estimate quantities and 
to reason arithmetically with those estimates 
exists in the brains of animals that have no 
language. The same or very similar nonver- 
bal mechanisms appear to operate in paral- 
lel with verbal estimation and reasoning in 
adult humans. They also operate to some 
extent before children learn to speak and 
before they have had any tutoring in the el- 
ements of arithmetic. These findings suggest 
that the verbal expression of number and of 
arithmetic thinking is based on a nonverbal 
system for estimating and reasoning about 
discrete and continuous quantity, which we 
share with many nonverbal animals. A rea- 
sonable supposition is that the neural sub- 
strate for this system arose far back in the 
evolution of brains precisely because of the 
puzzle to which Wigner called attention: 
Arithmetic reasoning captures deeply im- 
portant properties of the world, which the 
animal brain must represent in order to act 
effectively in it. 

The recognition that there is a nonverbal
system of arithmetic reasoning in human
and many nonhuman animals is recent,
but it influences most contemporary
experimental work on mathematical cognition.
This review is organized around the
questions: (1) What are the properties of 
this nonverbal system? (2) How is it related 
to the verbal system and written numeri- 
cal systems? 


What Is a Number? 


Arithmetic is one of the few domains
of human thought that has been extensively
formalized. This formalization did not begin
in earnest until the middle of the nineteenth
century (Boyer & Merzbach, 1989). In the
process of formalizing the arithmetic foundations
of mathematics, mathematicians changed their minds
about what a number is. Before formalization,
an intuitive understanding of what a number is
determined what could legitimately be done with it.
Once the formal “games” about number were made
explicit, anything that played by the rules was
a number.

This formalist viewpoint is crucial to an 
understanding of issues in the current sci- 
entific literature on mathematical cognition. 
Many of them turn on questions of how we 
are to recognize and understand the proper- 
ties of mental magnitudes. Mental magnitude 
refers to an inferred (but, one supposes, po- 
tentially observable and measurable) entity 
in the head that represents either numeros- 
ity (for example, the number of oranges in 
a case) or another magnitude (for example, 
the length, width, height, and weight of the 
case) and that has the formal properties of a 
real number. 

For a mental magnitude to represent an 
objective magnitude, it must be causally re- 
lated to that objective magnitude. It must 
also be shown that it is a player in a men- 
tal game (a functionally cohesive collection 
of brain processes) that operates accord- 
ing to at least some of the rules of arith- 
metic. When putative mental numbers do 
not validly enter into, at a minimum, mental
addition, mental subtraction, and mental
ordering, then they do not function as
numbers.


Kinds of Numbers 


The ancient Greeks had considerable success 
axiomatizing geometry, but mathematicians 
did not axiomatize the system of numbers 
until the nineteenth century, after it had un- 
dergone a large, historically documented ex- 
pansion. Before this expansion, it was too 
messy and incomplete to be axiomatized, 
because it lacked closure. A system of num- 
bers is closed under a combinatorial oper- 
ation if, when you apply the operation to 
any pair of numbers, the result is a number. 
Adding or multiplying two positive integers 
always produces a positive integer, so the 
positive integers are closed under addition 
and multiplication. They are also closed under
the operation of ordering. For any pair of
numbers, $(a \geq b) = 1$ if $a$ is greater than
or equal to $b$, and $0$ if not. These three operations
(addition, multiplication, and ordering) are
the core operations of arithmetic. They and 
their inverses make the system what it is. 
The problem comes from the inverse op- 
erations of subtraction and division. When 
you subtract a bigger number from a smaller, 
the result is not a positive integer. Should 
one regard the result as a number? Un- 
til well into the nineteenth century, many 
professional mathematicians did not. Thus, 
subtracting a bigger number from a smaller 
number was not a legitimate mathematical 
operation. This was inconvenient, because it 
meant that in the course of algebraic reason- 
ing (reasoning about unspecified numbers), 
one might unwittingly do something that 
was illegitimate. This purely practical con- 
sideration strongly motivated the admission 
of the negative numbers and zero to the set 
of numbers acknowledged to be legitimate. 
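
The notion of closure can be checked mechanically on a finite sample. The following Python sketch is an illustration only: the helper names `is_positive_integer` and `closed_under` are hypothetical, and testing the integers 1 through 20 cannot prove closure for all positive integers. It merely exhibits the counterexamples that subtraction and division produce.

```python
from itertools import product

def is_positive_integer(x):
    """True if x is an int (not a bool) greater than zero."""
    return isinstance(x, int) and not isinstance(x, bool) and x > 0

def closed_under(op, numbers):
    """Check that op applied to every ordered pair yields a positive integer."""
    return all(is_positive_integer(op(a, b)) for a, b in product(numbers, repeat=2))

sample = range(1, 21)  # a finite sample, chosen arbitrarily for illustration

add_closed = closed_under(lambda a, b: a + b, sample)
mul_closed = closed_under(lambda a, b: a * b, sample)
sub_closed = closed_under(lambda a, b: a - b, sample)   # fails, e.g. 1 - 2

# Division: the quotient is a positive integer only when b divides a exactly.
div_closed = all(a % b == 0 for a, b in product(sample, repeat=2))

print(add_closed, mul_closed, sub_closed, div_closed)
```

Addition and multiplication pass on the sample, whereas subtraction and division fail as soon as the second number is at least as large as, or does not divide, the first.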
When one divides one integer by an- 
other, the result, called a rational number, 
or, more colloquially, a fraction, is rarely an 
integer. From the earliest times from which 
we have written records, people who worked 
with written numbers included at least some 
rational numbers among the numbers, but,
traordinary difficulties in figuring out how to
do arithmetic with rational numbers in general.
What is the sum of 1/3 and 11/17? That
was a hard question in ancient Egypt and 
remains so today in classrooms all around 
the world. 
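
For readers who want to check the sum, here is a short Python sketch using the standard-library `fractions` module. The variable names are ours; the arithmetic is the usual common-denominator method.

```python
from fractions import Fraction

a = Fraction(1, 3)
b = Fraction(11, 17)

# The common denominator is 3 * 17 = 51, so 1/3 = 17/51 and 11/17 = 33/51.
total = a + b

print(total)  # 50/51
```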

The common notation for a fraction spec- 
ifies a number not by giving it a unique 
name like two but rather by specifying a way 
of generating it (divide the number one by 
the number two). The practice of specify- 
ing a number by giving an arithmetic proce- 
dure that will generate it to whatever level 
of precision is required has grown stronger 
over the millennia. It is the key to a rigorous
handling of both irrational and complex
numbers and to the way in which digital 
computers operate with real numbers. But 
it is discomfiting, for several reasons. First, 
there are an infinity of different notations 
for the same number: 1/2, 2/4, 3/6, and so 
on, all specifying the same number. More- 
over, for most rational numbers, there is no 
complete decimal representation. Carrying 
out the division gives a repeating decimal. In 
short, you cannot write down a symbol for 
most rational numbers that is both complete 
and unique.
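
Both difficulties can be made concrete. In the Python sketch below, `Fraction` reduces every notation to lowest terms, so 1/2, 2/4, and 3/6 compare as equal. The helper `decimal_digits` is a hypothetical name for ordinary long division, which shows the repeating pattern that prevents a complete decimal representation.

```python
from fractions import Fraction

# Different notations, one number: Fraction reduces each to lowest terms.
notations = [Fraction(1, 2), Fraction(2, 4), Fraction(3, 6)]
all_same = len(set(notations)) == 1

def decimal_digits(numerator, denominator, n_digits):
    """Return the first n_digits after the decimal point, by long division."""
    digits = []
    remainder = numerator % denominator
    for _ in range(n_digits):
        remainder *= 10
        digits.append(remainder // denominator)
        remainder %= denominator
    return digits

third = decimal_digits(1, 3, 8)       # the digit 3 repeats indefinitely
seventh = decimal_digits(1, 7, 12)    # the block 1, 4, 2, 8, 5, 7 repeats

print(all_same, third, seventh)
```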

Finally, when fractions are allowed to be 
numbers, the discrete ordering of the num- 
bers is lost. It no longer is possible to specify 
the next number in the sequence, because 
there are an infinite number of rational num- 
bers between any two rational numbers. For 
all these reasons, admitting fractions to the 
system of numbers makes the system more 
difficult to work with in the concrete, albeit 
more powerful in the abstract, because the 
system of rational numbers is, with one ex- 
ception, closed under division. 

Allowing negative numbers and fractions 
to be numbers also creates problems with 
what otherwise seem to be sound principles 
for reasoning about numbers. For example, 
it seems to be sound to say that dividing the 
bigger of two numbers by the smaller gives a 
number that is bigger than the number one 
gets if one divides the smaller by the bigger. 
What then are we to make of the “fact” that 
1/-1 = -1/1 = -1?


ing to be necessary if we want to treat as 
numbers entities that you do not get by 
counting. But, humans do want to do this, 
and they have wanted to since the beginning 
of recorded history. We measure quantities 
like lengths, weights, and volumes in order 
to represent them with numbers. What the 
measuring does — if it is done well — is give 
us “the right number” or at least one usable 
for our purposes. Measuring and the result- 
ing representation of continuous quantities 
by numbers go back to the earliest written 
records. Indeed, it is often argued that writ- 
ing evolved from a system for recording the 
results of measurements made in the course 
of commerce (bartering, buying, and sell- 
ing), political economy (taxation), survey- 
ing, and construction (Menninger, 1969). 
The ancient Greeks believed that, in prin- 
ciple, all measurable magnitudes could be 
represented by rational numbers. Everything 
was a matter of proportion, and any propor- 
tion could be expressed as the ratio of two 
integers. They were also the first to try to for- 
malize mathematical thinking. In doing so, 
they discovered, to their horror, that frac- 
tions did not suffice to represent all possible 
proportions. They discovered that the pro- 
portion between the side of a square and its 
diagonal could not be represented by a frac- 
tion. The Pythagorean formula for calculat- 
ing the diagonal of a square says that the di- 
agonal is equal to the square root of the sum 
of the squares of the sides. In this case, the 
diagonal is equal to $\sqrt{1^2 + 1^2} = \sqrt{1 + 1} =
\sqrt{2}$. The Greeks proved that there is no fraction
that, when multiplied by itself, is equal
to 2. If only integers and fractions are num- 
bers, then the length of the diagonal of the 
unit square cannot be represented by a num- 
ber. Put another way, you can measure the 
side of the square or you can measure its di- 
agonal, but you cannot measure them both 
exactly within the same measuring system — 
unless you are willing to include among the 
numbers in that system numbers that are 
not integers (cannot be counted) and are not 
even the ratio of two integers. You must in- 
clude what the Greeks called the irrational 
numbers. But if you do include the irrational
them in the general case?

Many irrationals can be specified by the 
operation of extracting roots, which is the 
inverse of the operation of raising a number 
to a power. Raising any positive integer to 
the power of any other always produces a 
positive integer. Thus, the system of positive 
integers is closed under raising to a power. 
The problem, as usual, comes from the in- 
verse operation — extracting roots. For most 
pairs of integers, a and b, the ath root of b 
is not a positive integer, nor even a rational 
number; it is an irrational number. The need 
within algebra to have an arithmetic that was 
closed under the extraction of roots was a 
powerful motivation for mathematicians to 
admit both irrational numbers and complex 
numbers to the set of numbers. By admitting 
irrational numbers, they created the system 
of so-called real numbers, which was essen- 
tial to calculus. To this day, there are pro- 
fessional mathematicians who question the 
legitimacy of irrational numbers. Nonethe- 
less, the real numbers, which include the 
irrationals (see Figure 23.1), are taken for 
granted by all but a very few contemporary 
mathematicians. 
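
A small numerical illustration may help here. It relies on one standard fact not stated in the text: a root of a positive integer is either an integer or irrational. The Python sketch below, whose function names are hypothetical, counts how few integers up to 100 have integer square or cube roots. It also runs a bounded search for a fraction whose square is exactly 2. The search is not a proof, only a demonstration that near misses such as 99/70 never become exact.

```python
def integer_root(b, a):
    """Return r if r**a == b for some positive integer r, else None."""
    r = round(b ** (1.0 / a))
    for candidate in (r - 1, r, r + 1):   # guard against floating-point rounding
        if candidate > 0 and candidate ** a == b:
            return candidate
    return None

def has_fraction_root_of_two(max_denominator):
    """Search p/q with q <= max_denominator for (p/q)**2 == 2, in integers only."""
    for q in range(1, max_denominator + 1):
        for p in range(q, 2 * q + 1):     # 1 < sqrt(2) < 2, so q <= p <= 2q
            if p * p == 2 * q * q:
                return True
    return False

perfect_squares = [b for b in range(2, 101) if integer_root(b, 2)]
perfect_cubes = [b for b in range(2, 101) if integer_root(b, 3)]
found = has_fraction_root_of_two(300)
near_miss = 99 * 99 - 2 * 70 * 70   # (99/70)**2 exceeds 2 by exactly 1/4900

print(len(perfect_squares), len(perfect_cubes), found, near_miss)
```

Of the 99 integers from 2 to 100, only 9 have an integer square root and only 3 have an integer cube root. The roots of all the others are irrational.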

The notion of a real number and that 
of a magnitude (for example, the length of 
a line) are formally identical. This means, 
among other things, that for every line seg- 
ment, there is a real number that uniquely 
represents the length of that segment (in 
a given system of measurement) and con- 
versely, for every real number, there is a 
line segment that represents the magnitude 
of that number. Therefore, in what follows, 
when we mention a mental magnitude, we 
mean an entity in the mind (brain) that func- 
tions within a system with the formal prop- 
erties of the real number system. Like the 
real number system, we assume that this sys- 
tem is a closed system: All of its combinato- 
rial operations, when applied to any pair of 
mental magnitudes, generate another mental 
magnitude. 

As this brief sketch indicates, the system 
of number recognized by almost all contem- 
porary professional mathematicians as “the 
number system” — the ever more inclusive 
Figure 23.1. The number system on which
modern mathematics is based. Not shown in this 
diagram are the algebraic numbers, which are the 
numbers that may be obtained through the 
extraction of roots (the solving of polynomial 
equations), nor the transcendental numbers, 
which may be obtained only by solving equations 
with trigonometric, exponential, or logarithmic 
terms. These are subcategories of the irrational 
numbers. 


hierarchy of kinds of numbers shown in Fig- 
ure 23.1 — has grown up over historical time 
with much of the growth culminating only 
in the preceding two centuries. The psycho- 
logical question is, “What is it in the minds 
of humans (and perhaps also nonhuman an- 
imals) that has been driving this process?” 
And how and under what circumstances 
does this mental machinery enable educated 
modern humans to master the basics of for- 
mal mathematics, when, and to the extent 
that they do so? 


Numerical Estimation and Reasoning 
in Animals 


The development of verbalized and written 
reasoning about number that culminated in 
a formalized system of real numbers isomor- 
phic to continuous magnitudes was driven 
by the fact that humans apply numerical rea- 
soning to continuous quantity just as much 
as they do to discrete quantity. In considering
the literature on numerical estimation
viewing the evidence that they estimate and
reason arithmetically about the quintessen- 
tially continuous quantity time. 

Common laboratory animals, such as the 
pigeon, the rat, and the monkey, measure 
and remember continuous quantities, such 
as duration, as has been shown in a variety 
of experimental paradigms. One of these is 
the so-called peak procedure. In this proce- 
dure, a trial begins with the onset of a stimu- 
lus signaling the possible availability of food 
at the end of a fixed interval, called the feed- 
ing latency. Responses made at or after the 
interval has elapsed trigger the delivery of 
food. Responses prior to that time have no 
consequences. On twenty to fifty percent of 
the trials, food is not delivered. On these tri- 
als, the key remains illuminated, the lever 
remains extended, or the hopper remains il- 
luminated for between four and six times 
longer than the feeding latency. On these tri- 
als, called probe trials, responding after the 
feeding latency has passed is pointless.

Peak-procedure data come from these un- 
rewarded trials. On such trials, the subject 
abruptly begins to respond some time before 
the expected end of the feeding latency and 
continues to peck or press or poke for some 
time after it has passed before abruptly stop- 
ping. The interval during which the subject 
responds brackets its subjective estimate of 
the feeding latency. Representative data are 
shown in Figure 23.2. 

Figure 23.2A shows seemingly smooth 
increases and decreases in the probability 
that the mouse is making an anticipatory re- 
sponse (poking its head into the feeding hop- 
per in anticipation of food delivery) on either 
side of the feeding latency. The smoothness 
is an averaging artifact. On any one trial, the 
onset and offset of anticipatory responding 
is abrupt, but the temporal locus of these 
onsets and offsets varies from trial to trial 
(Church, Meck, & Gibbon, 1994). The peak 
curves in Figure 23.2, like peak curves in 
general, are the cumulative start distribu- 
tions minus the cumulative stop distribu- 
tions, where start and stop refer to the on- 
set and offset of sustained food anticipatory 
behavior. 
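
The construction of a peak curve from abrupt single-trial starts and stops can be sketched in a few lines of Python. Everything numerical here is an assumption made for illustration, not a value from the experiments: a 30-second feeding latency, start times centered at 0.7 of the latency, stop times at 1.4 of the latency, and Gaussian trial-to-trial variability.

```python
import random

def peak_curve(start_times, stop_times, time_points):
    """Cumulative start distribution minus cumulative stop distribution."""
    n = len(start_times)
    curve = []
    for t in time_points:
        started = sum(s <= t for s in start_times) / n
        stopped = sum(e <= t for e in stop_times) / n
        curve.append(started - stopped)
    return curve

rng = random.Random(0)
latency = 30.0      # hypothetical feeding latency, in seconds
n_trials = 2000

starts, stops = [], []
for _ in range(n_trials):
    # Assumed parameters: each standard deviation is 20% of the corresponding mean.
    a = rng.gauss(0.7 * latency, 0.14 * latency)
    b = rng.gauss(1.4 * latency, 0.28 * latency)
    starts.append(min(a, b))   # a trial cannot stop before it starts
    stops.append(max(a, b))

times = [float(t) for t in range(0, 91)]
curve = peak_curve(starts, stops, times)
peak_time = times[curve.index(max(curve))]
print(peak_time, round(max(curve), 2))
```

Each simulated trial contributes a step up at its start and a step down at its stop. The smooth rise and fall appears only when many such steps are averaged.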
Figure 23.2. Representative peak procedure
data: Probability that the mouse’s head was in
the feeding hopper as a function of the time
elapsed since the beginning of a trial and the
feeding latency. (The feeding latency varied
between blocks of trials.) A. The original data.
These peak curves are the cumulative
distribution of start times (rising phase) minus
the cumulative distribution of stop times (falling
phase). These are the raw distributions (no curve
has been fitted). B. Same data as in A,
replotted as a proportion of the feeding latency.
Because the variability in the onsets and offsets
of responding is proportional to the feeding
latency, as are the locations of the means of the
distributions relative to the target times, the peak
curves superpose when plotted as a proportion of
this latency. Data originally published by King,
McDonald, & Gallistel (2001).


When the data in Figure 23.2A are re- 
plotted against the proportion of the feed- 
ing latency elapsed, rather than against the 
latency itself, the curves superpose (Fig- 
ure 23.2B). Thus, both the location of the 
distributions relative to the target latency 
and the trial-to-trial variability in the onsets 
and offsets of responding are proportional to 
the remembered latency. Put another way, 


[Figure 23.3 plot. Vertical axis: Probability of Trying Alcove. Horizontal axis: Number of Presses Made. One curve for each number of presses required to arm food release.]

Figure 23.3. The probability of breaking off to try the feeding 
alcove as a function of the number of presses made on the arming 
lever and the number required to arm the food-release beam at the 
entrance to the feeding alcove. Subjects were rats. Redrawn from 
Platt & Johnson, 1971, with permission. 


the probabilities that the subject will have 
begun to respond or will have stopped re- 
sponding are determined by the proportion 
of the remembered feeding latency that has 
elapsed. This property of remembered dura- 
tions is called scalar variability. 
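
Scalar variability amounts to a constant coefficient of variation, which is why the curves superpose after rescaling. The Python sketch below illustrates this with simulated stop times. The latencies, the relative mean of 1.4, and the coefficient of variation of 0.2 are assumed values, and the Gaussian form of the noise is likewise an assumption.

```python
import random
import statistics

def simulate_stop_times(latency, n_trials, rng, relative_mean=1.4, cv=0.2):
    """Draw stop times whose mean and standard deviation both scale with latency."""
    mean = relative_mean * latency
    return [rng.gauss(mean, cv * mean) for _ in range(n_trials)]

rng = random.Random(1)
summary = {}
for latency in (10.0, 20.0, 40.0):   # hypothetical feeding latencies, in seconds
    stops = simulate_stop_times(latency, 5000, rng)
    normalized = [t / latency for t in stops]
    summary[latency] = {
        "sd_seconds": statistics.stdev(stops),
        "mean_normalized": statistics.mean(normalized),
        "cv": statistics.stdev(stops) / statistics.mean(stops),
    }

for latency, row in summary.items():
    print(latency, {key: round(value, 2) for key, value in row.items()})
```

The standard deviation in seconds grows with the latency, but the normalized mean and the coefficient of variation stay essentially the same at every latency.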

Rats, pigeons, and monkeys also count 
and remember numerosities (Brannon & 
Roitman, 2003; Church & Meck, 1984; De- 
haene, 1997; Dehaene, Dehaene-Lambertz, 
& Cohen, 1998; Gallistel, 1990; Gallistel 
& Gelman, 2000). One of the early pro- 
tocols for assessing counting and numerical 
memory was developed by Mechner (1958) 
and later used by Platt and Johnson (1971). 
The subject must press a lever some num- 
ber of times (the target number) to arm 
the infrared beam at the entrance to a feed- 
ing alcove. When the beam is armed, inter- 
rupting it releases food. Pressing too many 
times before trying the alcove incurs no 
penalty beyond that of having made super- 
numerary presses. Trying the alcove prema- 
turely incurs a 10-second time-out, which 
the subject must endure before returning to 
the lever to complete the requisite number 
of presses. Data from such an experiment 
are shown in Figure 23.3. They look strik- 
ingly like the temporal data. The number 
of presses at which subjects are maximally 
likely to break off pressing and try the al- 
cove peaks at or slightly beyond the required 
number for required numbers ranging from 
four to twenty-four. As the remembered target
number gets larger, the variability in the


break-off number also gets proportionately 
greater. Thus, behavior based on number 
also exhibits scalar variability. 

The fact that behavior based on remem- 
bered numerosity exhibits scalar variability 
just like the scalar variability seen in behav- 
ior based on the remembered magnitude of 
continuous quantities such as duration sug- 
gests that numerosity is represented in the 
brains of nonverbal vertebrates by mental 
magnitudes; that is, by entities with the for- 
mal properties of the real numbers, rather 
than by discrete symbols such as words or 
bit patterns. When a device such as an analog 
computer represents numerosities by differ- 
ent voltage levels, noise in the voltages leads 
to confusions between nearby numbers. If,
by contrast, a device represents countable
quantity by countable (that is, discrete) sym- 
bols, as do digital computers and written 
number systems, then one does not expect 
to see the kind of variability seen in Figures 
23.2 and 23.3. The bit-pattern symbol for 
fifteen is 01111, for example, and for sixteen 
it is 10000. Although the numbers are adja- 
cent in the ordering of the integers, the dis- 
crete binary symbols for them differ in all 
five bits. Jitter in the bits (uncertainty about 
whether a given bit was 0 or 1) would make 
fourteen (01110), thirteen (01101), eleven 
(01011), and seven (00111) all equally and 
maximally likely to be confused with fif- 
teen, because the confusion arises in each 
case from the misreading of one bit. These 
dispersed numbers should be confused with
cent sixteen. (For an analysis of the error patterns
to be expected in cascade counters, see
Killeen & Taylor, 2001). Similarly, a scribe 
copying a handwritten English text is pre- 
sumably more likely to confuse “seven” and 
“eleven” than “seven” and “eight.” The na- 
ture of the variability in a remembered tar- 
get number therefore suggests that what is 
being remembered is a magnitude — some- 
thing that behaves like a continuous quan- 
tity, which is to say something with the for- 
mal properties of a real number. 
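
The bit-pattern argument can be verified directly. The Python sketch below uses five-bit symbols, as in the text, and counts differing bits (the Hamming distance). The function names are ours.

```python
def bit_pattern(n, width=5):
    """Fixed-width binary symbol for a non-negative integer."""
    return format(n, "0{}b".format(width))

def hamming_distance(a, b, width=5):
    """Number of bit positions in which the symbols for a and b differ."""
    return sum(x != y for x, y in zip(bit_pattern(a, width), bit_pattern(b, width)))

target = 15
one_bit_neighbors = [n for n in range(32) if hamming_distance(n, target) == 1]
distance_to_sixteen = hamming_distance(15, 16)

print(bit_pattern(15), bit_pattern(16))   # 01111 10000
print(one_bit_neighbors)                  # [7, 11, 13, 14, 31]
print(distance_to_sixteen)                # 5
```

The calculation recovers seven, eleven, thirteen, and fourteen as one-bit neighbors of fifteen. With five bits it also yields thirty-one (11111), which the text does not list. Sixteen, the numerically adjacent value, is the five-bit symbol farthest from fifteen.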


Numerosity and Duration Are 
Represented by Comparable 
Mental Magnitudes 


Meck and Church (1983) pointed out that 
the mental accumulator model that Gibbon 
(1977) had proposed to explain the gen- 
eration of mental magnitudes representing 
durations could be modified to make it gen- 
erate mental magnitudes representing nu- 
merosities. Gibbon had proposed that while 
a duration was being timed a stream of im- 
pulses fed an accumulator, so that the ac- 
cumulation grew in proportion to the dura- 
tion of the stream. When the stream ended 
(when timing ceased), the resulting accumu- 
lation was read into memory, where it repre- 
sented the duration of the interval. Meck and 
Church postulated that to get magnitudes 
representing numerosity, the equivalent of a 
pulse former was inserted into the stream of 
impulses, so that for each count there was 
a discrete increment in the contents of the 
accumulator, as happens when a cup of liq- 
uid is poured into a graduated cylinder (Fig- 
ure 23.4). At the end of the count, the re- 
sulting accumulation is read into memory, 
where it represents the numerosity. 

The model in Figure 23.4 is the well- 
known accumulator model for nonverbal 
counting by the successive incrementation 
of mental magnitudes. It is also the origin of 
the hypothesis that the mental magnitudes 
representing duration and the mental mag- 
nitudes representing numerosity are essen- 
Figure 23.4. The accumulator model for the
nonverbal counting process. At each count, the 
brain increments a quantity — an operation 
formally equivalent to pouring a cup into a 
graduate. The final magnitude (the contents of 
the graduate at the conclusion of the count) is 
stored in memory, where it represents the 
numerosity of the counted set. Memory is noisy 
(represented by the wave in the graduate), 
which is to say that the values read from 
memory on different occasions vary. The 
variability in the values read from memory is 
proportional to the mean value of the 
distribution (scalar variability). 


tially the same, differing only in the mapping 
process that generates them and, therefore, 
in what it is they refer to. Put another way, 
both numerosity and duration are repre- 
sented mentally by real numbers. Meck and 
Church (1983) compared the psychophysics 
of number and time representation in the rat 
and concluded that the coefficient of varia- 
tion, the ratio between the standard devi- 
ation and the mean, was the same, which 
is further evidence for the hypothesis that 
the same system of real numbers is used in 
both cases. 
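
The accumulator model lends itself to a toy simulation. The Python sketch below is a simplified illustration under stated assumptions, not the model as fitted by Meck and Church. The class name `Accumulator`, the unit increment, the Gaussian memory noise, and the coefficient of variation of 0.15 are all hypothetical choices.

```python
import random
import statistics

class Accumulator:
    """Toy accumulator: one increment per counted event, noisy memory read-out."""

    def __init__(self, cv=0.15, seed=0):
        self.cv = cv                      # assumed coefficient of variation of memory
        self.rng = random.Random(seed)
        self.magnitude = 0.0

    def count(self, n_events, increment=1.0):
        """Pour one 'cup' into the accumulator per event and return the total."""
        self.magnitude = 0.0
        for _ in range(n_events):
            self.magnitude += increment
        return self.magnitude

    def recall(self, stored):
        """Read a stored magnitude from memory; the noise scales with the magnitude."""
        return self.rng.gauss(stored, self.cv * stored)

acc = Accumulator()
cvs = {}
for n in (4, 8, 16, 24):
    stored = acc.count(n)
    reads = [acc.recall(stored) for _ in range(4000)]
    cvs[n] = statistics.stdev(reads) / statistics.mean(reads)

print({n: round(value, 3) for n, value in cvs.items()})
```

Because the noise is proportional to the stored magnitude, the ratio of standard deviation to mean comes out roughly equal for every target number. This is the signature that Meck and Church compared across number and time.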

The model in Figure 23.4 was originally 
proposed to explain behavior based on the 
numerosity of a set of serial events (for ex- 
ample, the number of responses made), but 
it may be generalized to the case in which 
the items to be counted are presented all at 
once — for example, as a to-be-enumerated 
visual array. In that case, each item in the 
array can be assigned a unit magnitude, and 
the unit magnitudes can then be summed
time. Dehaene and Changeux (1993) developed
a neural net model based on this idea.
In their model, the activity aroused by each 
item in the array is reduced to a unit amount 
of activity, so that it is no longer proportional 
to the size, contour, and so on, of the item. 
The units of activity corresponding to the 
entities in the array are summed across the 
visual field to yield a mental magnitude rep- 
resenting the numerosity of the array. 
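
The normalize-then-sum idea can be stated in a few lines of Python. This is a bare sketch of the principle, not the Dehaene and Changeux network. The function name and the activity values are hypothetical.

```python
def numerosity_magnitude(item_activities):
    """Reduce each item's activity to one unit, then sum across the array."""
    return sum(1.0 for activity in item_activities if activity > 0)

small_items = [0.2, 0.3, 0.1]   # hypothetical activity evoked by three small items
large_items = [5.0, 9.5, 7.2]   # hypothetical activity evoked by three large items

raw_small = sum(small_items)
raw_large = sum(large_items)
count_small = numerosity_magnitude(small_items)
count_large = numerosity_magnitude(large_items)

print(raw_small, raw_large, count_small, count_large)
```

The raw activity totals differ greatly with item size, whereas the normalized sums are identical, so the resulting magnitude depends only on how many items the array contains.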


Nonhuman Animals Reason 
Arithmetically 


We have repeatedly referred to the real num- 
ber system because numbers (or magnitudes) 
are truly that only if they are arithmeti- 
cally manipulated. Being causally connected 
to something that can be represented nu- 
merically does not make an entity in the 
brain or anywhere else a number. It also 
must be processed suitably. The defining features
of a numerical representation are: (1)
There is a causal mapping from discrete and 
continuous quantities in the world to the 
numbers. (2) The numbers are arithmeti- 
cally processed. (3) The mapping is usefully 
(validly) invertible: The numbers obtained 
through arithmetic processing correctly re- 
fer through the inverse mapping back to the 
represented reality. 

There is a considerable experimental liter- 
ature demonstrating that laboratory animals 
reason arithmetically with mental magni- 
tudes representing numerosity and duration. 
They add, subtract, divide, and order subjec- 
tive durations and subjective numerosities; 
they divide subjective numerosities by sub- 
jective durations to obtain subjective rates 
of reward; and they multiply subjective rates 
of reward by the subjective magnitudes of 
the rewards to obtain subjective incomes. 
Moreover, the mapping between real mag- 
nitudes and their subjective counterparts is 
such that their mental operations on subjec- 
tive quantities enable these animals to be- 
have effectively. Here we summarize a few 
of the relevant studies. (For reviews, see Boy- 
sen & Hallberg, 2000; Brannon & Roitman, 


Spelke & Dehaene, 1999). 
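
The chain of operations just described (numerosity divided by duration gives a rate, and rate multiplied by reward magnitude gives an income) is simple enough to write out. The Python sketch below uses invented numbers for two hypothetical food sources. It ignores the noise in the subjective quantities.

```python
def subjective_rate(n_rewards, duration):
    """Rate of reward: numerosity divided by duration."""
    return n_rewards / duration

def subjective_income(rate, reward_magnitude):
    """Income: rate of reward multiplied by the magnitude of each reward."""
    return rate * reward_magnitude

# Hypothetical sources: A gives 12 unit rewards per minute, B gives 6 triple rewards.
rate_a = subjective_rate(12, 60.0)
rate_b = subjective_rate(6, 60.0)
income_a = subjective_income(rate_a, 1.0)
income_b = subjective_income(rate_b, 3.0)
richer = "B" if income_b > income_a else "A"

print(rate_a, rate_b, richer)
```

Source B delivers rewards at half the rate of source A but yields the larger income, so a preference for B requires the multiplication as well as the division.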


Adding Numerosities 


Boysen and Berntson (1989) taught chimpanzees
to pick the Arabic numeral corresponding
to the number of items they observed.
In the last of a series of tests of this
ability, they had their subjects go around a 
room and observe either caches of actual or- 
anges in two different locations or Arabic 
numerals that substituted for the caches 
themselves. When they returned from a trip, 
the chimps picked the Arabic numeral cor- 
responding to the sum of the two numerosi- 
ties they had seen, whether the numerosities 
had been directly observed (hence, possibly 
counted) or symbolically represented (hence 
not counted). In the latter case, the mag- 
nitudes corresponding to the numerals ob- 
served were presumably retrieved from a 
memory map relating the arbitrary symbols 
for number (the Arabic numerals) to the 
mental magnitudes that naturally represent 
those numbers. Once retrieved, they could 
be added very much like the magnitudes 
generated by the nonverbal counting of the 
caches. (For further evidence that nonver- 
bal vertebrates sum numerical magnitudes, 
see Beran, 2001; Church & Meck, 1984; 
Hauser, 2001, and citations therein; Olthof, 
Iden, & Roberts, 1997; Olthof & Roberts, 
2000; Rumbaugh, Savage-Rumbaugh, & 
Hegel, 1987.) 


Subtracting Durations and Numerosities 


On each trial of the time-left procedure 
(Gibbon & Church, 1981), subjects are of- 
fered an ongoing choice between a steadily 
diminishing delay on the one hand (the time- 
left option) and a fixed delay on the other 
hand (the standard option). At an unpre- 
dictable point in the course of a trial, the 
opportunity to choose ends. Before it gets 
its reward, the subject must then endure the 
delay associated with the option it was ex- 
ercising at that moment. If it was respond- 
ing at the so-called standard station, it must 
endure the standard delay; if it was responding
at the time-left station, it must endure
the time left. At the beginning of a trial,
the time left is much longer than the stan- 
dard delay, but it grows shorter as the trial 
goes on, because the time so far elapsed in 
a trial is subtracted from the initial value 
to yield the time left. When the subjective 
time left is less than the subjective standard, 
subjects switch from the standard option to 
the time-left option. The subjective time 
left is the subjective duration of a remem- 
bered initial duration (subjective initial du- 
ration) minus the subjective duration of the 
interval elapsed since the beginning of the 
trial. In this experiment, therefore, subjects’ 
behavior depends on the subjective ordering
of a subjective difference and a subjective
standard (two of the basic arithmetic
operations). 
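
The decision rule in the time-left procedure combines a subtraction with an ordering. It can be sketched in Python as follows. The delays are invented, and the sketch uses objective, noise-free values where the animal has only noisy subjective magnitudes.

```python
def time_left(initial_delay, elapsed):
    """Remaining delay on the time-left option: initial value minus time elapsed."""
    return initial_delay - elapsed

def preferred_option(initial_delay, standard_delay, elapsed):
    """Order the difference against the standard and pick the shorter delay."""
    if time_left(initial_delay, elapsed) < standard_delay:
        return "time-left"
    return "standard"

initial_delay = 60.0    # hypothetical values, in seconds
standard_delay = 30.0
choices = {
    t: preferred_option(initial_delay, standard_delay, t)
    for t in (0, 10, 20, 30, 40, 50)
}

print(choices)
```

With these values the preference switches from the standard option to the time-left option once more than 30 seconds of the trial have elapsed.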

In the number-left procedure (Brannon
et al., 2001), pigeons peck a center key to
generate flashes and to activate two choice 
keys. The flashes are generated on a vari- 
able ratio schedule, which means that the 
number of pecks required to generate each 
flash varies randomly between one and eight. 
When the choice keys are activated, the pi- 
geons can get a reward by pecking either 
of them, but only after their pecking gener-
ates the requisite number of flashes. For one
of the choice keys, the so-called standard key, 
the requisite number is fixed and indepen- 
dent of the number of flashes already gener- 
ated. For the other choice, the number-left 
key, the requisite number is the difference 
between a fixed starting number and the 
tally of flashes already generated by peck- 
ing the center key. The flashes generated by 
pecking a choice key are also delivered on a 
variable ratio schedule. 

The use of variable ratio schedules for 
flash generation partially dissociates time 
and number. The number of pecks required 
to generate any given number of flashes — 
and, hence, the amount of time spent peck- 
ing — varies greatly from trial to trial. This 
makes possible an analysis to determine 
whether subjects’ choices are controlled by 
the time spent pecking the center key or by 
the number of flashes generated. The analy- 
sis shows that it was number, not duration, 
that controlled the pigeons’ choices.

In this experiment, subjects chose the
number-left key when the subjective number
left was less than some fraction of the
subjective number of flashes required on the 
standard key. Their behavior therefore was 
controlled by the subjective ordering of a 
subjective numerical difference and a sub- 
jective numerical standard. For an example 
of spontaneous subtraction in monkeys, see 
Sulkowski and Hauser (2001). 
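
How a variable ratio schedule dissociates number from time, as described above, can be illustrated with a small simulation. It is a sketch under stated assumptions: the number of pecks per flash is drawn uniformly from one to eight (the source says only that it varies randomly in that range), and the function name, seed, and flash count are hypothetical.

```python
import random
import statistics


def pecks_for_flashes(n_flashes, rng, max_ratio=8):
    """Total center-key pecks needed to generate n_flashes flashes."""
    # Assumption: each flash needs a uniformly random 1..max_ratio pecks.
    return sum(rng.randint(1, max_ratio) for _ in range(n_flashes))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
peck_counts = [pecks_for_flashes(6, rng) for _ in range(1000)]

# The same number of flashes takes very different amounts of pecking,
# and therefore very different amounts of time, from trial to trial.
print(min(peck_counts), max(peck_counts), statistics.mean(peck_counts))
```

Because the flash count is fixed while the pecking time varies widely, choices that track the flash count can be told apart from choices that track elapsed time.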

There also is evidence that the mental 
magnitudes representing duration and rates 
are signed — there are both positive and nega- 
tive mental magnitudes (Gallistel & Gibbon, 
2000; Savastano & Miller, 1998). In other 
words, there is evidence for subtraction and 
for the hypothesis that the system for arith- 
metic reasoning with mental magnitudes is 
closed under subtraction. 


Dividing Numerosity by Duration 


When vertebrates, from fish to humans, are 
free to forage in two different nearby lo- 
cations, moving back and forth repeatedly 
between them, the ratio of the expected
durations of the stays in the two locations
matches the ratio of the numbers of rewards
obtained per unit of time (Herrnstein, 1961).
Until recently, it had been assumed that this 
matching behavior depended on the law of 
effect. When subjects do not match, they get 
more reward per unit of time invested in one 
patch than per unit of time invested in the 
other. Only when they match do they get 
equal returns on their investments. Match- 
ing therefore could be explained on the as- 
sumption that subjects try different ratios 
of investments (different ratios of expected 
stay durations) until they discover the ra- 
tio that equates the returns (Herrnstein & 
Vaughan, 1980). 
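
The claim that returns are equal only under matching can be checked with a short calculation. The sketch below makes a simplifying assumption that is not stated in the text: each location delivers rewards at a fixed rate per unit of session time, whatever fraction of time the subject spends there. The function names and the rates are hypothetical.

```python
def matching_allocation(rates):
    """Fractions of time that match the relative rates of reward."""
    total = sum(rates)
    return [rate / total for rate in rates]


def returns_per_unit_time(rates, time_fractions):
    """Reward obtained per unit of time invested in each location."""
    return [rate / fraction for rate, fraction in zip(rates, time_fractions)]


rates = [3.0, 1.0]  # hypothetical rewards per minute in the two locations

print(returns_per_unit_time(rates, [0.5, 0.5]))                  # [6.0, 2.0]
print(returns_per_unit_time(rates, matching_allocation(rates)))  # [4.0, 4.0]
```

With equal time in both locations the richer location pays more per unit of time invested; with a 3:1 allocation the returns are equal.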

Gallistel et al. (2001) showed that rats 
adjust to changes in the scheduled rates of 
reward as fast as it is in principle possi- 
ble to do so; they are ideal detectors of 
such changes. They could not adjust so 
rapidly if they were discovering by trial 
and error the ratio of expected stay dura- 
tions that equated their returns. The im-
portance of this in the present context
is that a rate of reward is the number of
rewards obtained in a given interval (a
discrete or countable quantity, which is the
kind of thing naturally represented by pos-
itive integers) divided by a continuous (or
uncountable) quantity: the duration of the
given interval, which is the kind of thing that 
can be represented only by a real number. 
Gallistel and Gibbon (2000) review the 
evidence that both Pavlovian and instrumen- 
tal conditioning depend on subjects’ estimat- 
ing rates of reward. They argue that rate of 
reward is the fundamental variable in con- 
ditioned behavior. The importance of this 
in the present context is twofold. First, it 
is evidence that subjects divide mental mag- 
nitudes. Second, it shows why it is essen- 
tial that countable and uncountable quantity 
be represented by commensurable mental 
symbols — symbols that are part of the 
same system and can be combined arithmeti- 
cally without regard to whether they repre- 
sent countable or uncountable quantity. If 
countable quantity were represented by one 
system (say, a system of discretely ordered 
symbols, formally analogous to the list of 
counting words) and uncountable (continu- 
ous) quantity by a different system (a system 
of mental magnitudes), it would not be pos- 
sible to estimate rates. The brain would have 
to divide mental apples by mental oranges.


Multiplying Rate by Magnitude 


When the magnitudes of the rewards ob- 
tained in two different locations differ, then 
the ratio of the expected stay durations 
is determined by the ratio of the incomes 
obtained from the two locations (Catania, 
1963; Harper, 1982; Keller & Gollub, 1977; 
Leon & Gallistel, 1998). The income from 
a location is the product of the rate and the 
reward magnitude. This result implies that 
subjects multiply subjective rates by sub- 
jective magnitudes to obtain subjective in- 
comes. The signature of multiplicative com- 
bination is that changing one variable by 
a given factor — for example, doubling the 
rate — changes the product by the same factor 
(doubles the income) regardless of the value 
of the other factor (the magnitude of the 
rewards). Leon and Gallistel (1998) showed
that changing the ratio of the reward rates
by a given factor changed the ratio of the ex-
pected stay durations by that factor, regard-
less of the ratio of the reward magnitudes,
thereby proving that subjective magnitudes 
combine multiplicatively with subjective 
rates to determine the ratio of expected stay 
durations. 
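
The signature of multiplicative combination described above can be made concrete. In this minimal sketch the function name and all numerical values are hypothetical; it assumes only what the text states, that income is rate times magnitude and that the ratio of expected stay durations equals the ratio of incomes.

```python
def expected_stay_ratio(rate_1, magnitude_1, rate_2, magnitude_2):
    """Ratio of expected stay durations, taken to equal the income ratio."""
    income_1 = rate_1 * magnitude_1
    income_2 = rate_2 * magnitude_2
    return income_1 / income_2


# Double the rate in location 1 at several reward-magnitude ratios.
for magnitude_ratio in (0.5, 1.0, 4.0):
    before = expected_stay_ratio(1.0, magnitude_ratio, 1.0, 1.0)
    after = expected_stay_ratio(2.0, magnitude_ratio, 1.0, 1.0)
    print(magnitude_ratio, after / before)  # second value is 2.0 each time
```

Had rates and magnitudes combined additively, the effect of doubling the rate would have depended on the magnitude ratio.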


Ordering Numerosities 


Most of the paradigms that demonstrate 
mental addition, subtraction, multiplication, 
and division also demonstrate the order- 
ing of mental magnitudes, because the sub- 
ject’s choice depends on this ordering. Bran- 
non and Terrace (2000) demonstrated more 
directly that monkeys order numerosities 
by presenting simultaneously several arrays 
differing in the numerosity of the items 
constituting each array and requiring their 
macaque subjects to touch the arrays in 
the order of their numerosity. When sub- 
jects had learned to do this for numerosi- 
ties between one and four, they generalized 
immediately to numerosities between five 
and nine. 

The most interesting feature of Brannon 
and Terrace’s results was that they found it 
impossible to teach subjects to touch the ar- 
rays in an order that did not conform to the 
order of the numerosities (either ascending 
or descending). This implies that the order- 
ing of numerosities is highly salient for a 
monkey. It cannot ignore their natural order- 
ing to learn an unnatural one. It also suggests 
that the natural ordering is not itself learned; 
it is inherent in the monkey’s representation 
of numerosity. What is learned is to respond 
on the basis of numerical order, not the or- 
dering itself. 

For further evidence that nonverbal ver- 
tebrates order numerosities and durations, 
see Biro and Matsuzawa (2001), Brannon 
and Roitman (2003), Brannon and Terrace 
(2002), Carr and Wilkie (1997), Olthof, 
Iden, and Roberts (1997), Rumbaugh and 
Washburn (1993), and Washburn and Rum- 
baugh (1991). 

In summary, research with vertebrates, 
some of which have not shared a common
ancestor with humans since before the rise
of the dinosaurs, implies that they represent
both countable and uncountable quantity by 
means of mental magnitudes. The system 
of arithmetic reasoning with these mental 
magnitudes is closed under the basic opera- 
tions of arithmetic; that is, mental magni- 
tudes may be mentally added, subtracted, 
multiplied, divided, and ordered without 
restriction. 


Humans Also Represent Numerosity 
with Mental Magnitudes 


The Symbolic Size and Distance Effects 


It would be odd if humans did not share 
with their remote vertebrate cousins (pi- 
geons) and near vertebrate cousins (chim- 
panzees) the mental machinery for repre- 
senting countable and uncountable quantity 
by means of a system of real numbers. That 
humans do represent integers with mental 
magnitudes was first suggested by Moyer 
and Landauer (1967, 1973) when they dis-
covered what has come to be called the
symbolic distance effect (Figure 23.5). When
subjects are asked to judge the numerical or-
der of Arabic numerals as rapidly as possible,
their reaction time is determined by the rela- 
tive numerical distance: The greater the dis- 
tance between the two numbers, the more 
quickly their order may be judged. Sub- 
sequently, Parkman (1971) further showed 
that the greater the numerical value of the 
smaller digit, the longer it takes to judge 
their order (the size effect). The two effects 
together may be summarized under a sin- 
gle law, namely that the time to judge the 
numerical order of two numerals is a func-
tion of the ratio of the numerical magni-
tudes they represent. Weber’s law, that the
discriminability of two magnitudes is a
function of their ratio, therefore ap-
plies to symbolically represented numerical
magnitude. 
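
Why a single ratio law covers both the distance effect and the size effect can be seen in a small calculation. The sketch assumes, for illustration only, that each numerosity is represented by a Gaussian magnitude whose standard deviation is a fixed fraction `w` of its mean; the index below and the value of `w` are hypothetical and are not a fitted model.

```python
import math


def discriminability(n1, n2, w=0.15):
    """Separation of two noisy magnitudes in units of their combined noise."""
    # Assumption: sd of each magnitude is w times its mean (scalar variability).
    return abs(n2 - n1) / (w * math.sqrt(n1 ** 2 + n2 ** 2))


# Same ratio, different sizes: identical discriminability.
print(math.isclose(discriminability(5, 6), discriminability(50, 60)))  # True

# Distance effect: 2 vs 5 is easier than 2 vs 3.
print(discriminability(2, 5) > discriminability(2, 3))                  # True

# Size effect: 2 vs 3 is easier than 7 vs 8.
print(discriminability(2, 3) > discriminability(7, 8))                  # True
```

Under this assumption the index depends only on the ratio of the two numbers, which is what Weber's law asserts.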

The size and distance effects in human
judgments of the ordering of discrete and
continuous quantities are robust. They are
observed when the numerosities being com-
pared are actually instantiated (by visual ar-
rays of dots) and when they are represented
symbolically by Arabic numerals (Buckley &
Gillman, 1974). The symbolic distance and
size effects are observed in the single-digit
range and in the double-digit range (De-
haene, Dupoux, & Mehler, 1990; Hinrichs,
Yurko, & Hu, 1981). That this effect of nu-
merical magnitude on the time to make an
order judgment should appear for symbol-
ically represented numerosities between 1
and 100 is decidedly counterintuitive. If in-
trospection were any guide to what one’s
brain was doing, one would think that the
facts about which numbers are greater than
which are stored in a table of some kind and
simply looked up. In that case, why would it
take longer to look up the ordering of 2 and
3 (or 65 and 62) than 2 and 5 (or 65 and 47)?
It does, however, and this suggests that the
comparison that underlies these judgments
operates with noisy mental magnitudes. Ac-
cording to this hypothesis the brain maps
the numerals to the noisy mental magnitudes
that would be generated by the nonverbal
numerical estimation system if it enumer-
ated the corresponding numerosity. It then
compares those two noisy mental magni-
tudes to decide which numeral represents
the bigger numerosity.


Figure 23.5. The symbolic and nonsymbolic size
and distance effects on the human reaction time
while judging numerical order in the range from 1
to 9. In three of the conditions, the numerosities
to be judged were instantiated by two dot arrays
(nonsymbolic numerical ordering). The dots
within each array were in either a regular
configuration, an irregular configuration that did
not vary upon repeated presentation, or in
randomly varying configurations. In the fourth
condition, the numerosities were represented
symbolically by Arabic numerals. The top panel
plots mean reaction times as a function of the
numerical difference. The bottom plots it as a
function of the size of the smaller comparand.
Replotted from Figures 23.1 and 23.2 in Buckley
& Gillman, 1974.


Figure 23.6. The reaction time and accuracy functions for monkey (Rhesus
macaque) and human subjects in touching the more numerous of two random
dot visual arrays presented side by side on a touch-screen video monitor.
Reproduced from Brannon & Terrace, 2002 with permission.

On this hypothesis, the comparison that 
mediates the verbal judgment of the numer- 
ical ordering of two Arabic numerals uses 
the same mental magnitudes and the same 
comparison mechanism as that used by the 
nonverbal numerical reasoning system that 
we are assumed to share with many nonver- 
bal animals. Consistent with this hypothe- 
sis is Brannon and Terrace’s (2002) finding 
that reaction time functions from humans 


and monkeys for judgments of the numer- 
ical ordering of pairs of visually presented 
dot arrays are almost exactly the same (Fig- 
ure 23.6). 

Buckley and Gillman (1974) modeled 
the underlying comparison process. In their 
model, numbers are represented in the brain 
by noisy signals (mental magnitudes) with 
overlapping distributions. The closer two 
numerosities are in the ordering of nu- 
merosities, the more their corresponding sig- 
nal distributions overlap. When the subject 
judges the ordering of two numerosities, the 
brain subtracts the signal representing the 
one numerosity from the signal represent- 
ing the other, and puts the signed difference 
in an accumulator — a mechanism that adds 
up inputs over, in this case, time. The accu- 
mulator for the ordering operation has fixed 
positive and negative thresholds. When its 
positive threshold is exceeded, it reports the 
one number to be greater than the other 
and vice versa when its negative threshold 
is exceeded. If neither accumulator thresh- 
old is exceeded, the comparator resamples 
the two signals, computes a second differ- 
ence, based on the two new samples, and 
adds it to the accumulator. The resampling
explains why it takes longer (on average) to
judge the order of two numerosities that are
close together: The closer
they are, the more their corresponding signal
distributions overlap. The more these distri-
butions overlap, the more samples will have
to be made and added together (accumu-
lated) before (on average) a decision thresh-
old is reached.
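
The accumulator account can be illustrated with a short simulation. This is a sketch in the spirit of the model just described, not Buckley and Gillman's actual parameterization: the Gaussian signals, the Weber fraction, the threshold, and the function names are all assumptions chosen for illustration.

```python
import random


def samples_to_decision(n_small, n_large, rng,
                        weber_fraction=0.2, threshold=10.0):
    """Number of difference samples accumulated before a threshold is crossed."""
    total, steps = 0.0, 0
    while abs(total) < threshold:
        # Assumption: each signal is Gaussian with sd proportional to its mean.
        larger = rng.gauss(n_large, weber_fraction * n_large)
        smaller = rng.gauss(n_small, weber_fraction * n_small)
        total += larger - smaller   # signed difference goes into the accumulator
        steps += 1
    return steps


def mean_samples(n_small, n_large, trials=2000, seed=0):
    """Average number of samples needed over many simulated judgments."""
    rng = random.Random(seed)
    return sum(samples_to_decision(n_small, n_large, rng)
               for _ in range(trials)) / trials


# Closer numerosities need more samples on average (the distance effect).
print(mean_samples(2, 3), mean_samples(2, 5))
```

In this sketch the number of samples stands in for reaction time; pairs that are closer together take more samples to reach a threshold.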


Nonverbal Counting in Humans 


Given the evidence from the symbolic size 
and distance effects that humans represent 
number with mental magnitudes, it seems 
likely that they share with the nonverbal 
animals in the vertebrate clade a nonver- 
bal counting mechanism that maps from nu- 
merosities to the mental magnitudes that 
represent them. If so, then it should be pos- 
sible to demonstrate nonverbal counting in 
humans when verbal counting is suppressed. 
Whalen, Gallistel, and Gelman (1999) pre- 
sented subjects with Arabic numerals on a 
computer screen and asked them to press a 
key as fast as they could without counting 
until it felt like they had pressed the number 
signified by the numeral. The results from 
humans looked very much like the results 
from pigeons and rats: The mean number of 
presses increased in proportion to the target 
number and the standard deviations of the 
distributions of presses increased in propor- 
tion to their mean, so that the coefficient of 
variation was constant. 
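
Scalar variability, in which the standard deviation grows in proportion to the mean, can be illustrated with simulated data. The sketch below does not reproduce the data of Whalen et al. (1999); the Gaussian form, the coefficient of variation of 0.15, the seed, and the function names are assumptions.

```python
import random
import statistics


def simulate_presses(target, rng, cv=0.15, trials=5000):
    """Simulated press counts with sd proportional to the target number."""
    return [rng.gauss(target, cv * target) for _ in range(trials)]


def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
cvs = {n: coefficient_of_variation(simulate_presses(n, rng))
       for n in (5, 10, 20, 40)}

# The coefficient of variation stays near 0.15 at every target number.
print(cvs)
```

The means and standard deviations both grow with the target, but their quotient does not, which is the pattern the text calls a constant coefficient of variation.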

This result suggests, first, that subjects 
could count nonverbally, and, second, that 
they could compare the mental magnitude 
thus generated to a magnitude obtained us- 
ing a learned mapping from numerals to 
mental magnitudes. Finally, it implies that 
the mapping from numerals to mental mag- 
nitudes is such that the mental magnitude 
given by this mapping approximates the 
mental magnitude generated by counting 
the numerosity signified by a given numeral. 

In a second task, subjects observed a dot 
flashing very rapidly but at irregular inter- 
vals. The rate of flashing (eight per sec- 
ond) was twice as fast as estimates of the 
maximum speed of verbal counting (Man-
dler & Shebo, 1982). Subjects were asked
to say how many times they thought the dot
had flashed. As
in the first experiment, the mean number es- 
timated increased in proportion to the num- 
ber of flashes and the standard deviation of 
the estimates increased in proportion to the 
mean estimate. This implies that the map- 
ping between the mental magnitudes gener- 
ated by nonverbal counting and the verbal 
symbols for numerosities is bidirectional; it 
can go from a symbol to a mental magnitude 
that is comparable to the one that would be 
generated by nonverbal counting, and it can 
go from the mental magnitude generated by 
a nonverbal count to a roughly correspond- 
ing verbal symbol. In both cases, the variabil- 
ity in the mapping is scalar. 

Whalen et al. (1999) gave several rea- 
sons for believing that their subjects did not 
count subvocally. We will not review them 
here, because a subsequent experiment 
speaks more directly to this issue (Cordes 
et al., 2001). 

Cordes et al. (2001) suppressed articula- 
tion by having their subjects repeat a com- 
mon phrase (“Mary had a little lamb”) while 
they attempted to press a target number of 
times, or by having subjects say “the” co- 
incident with each press. In control exper- 
iments, subjects were asked to count their 
presses out loud. In all conditions, subjects 
were asked to press as fast as possible. 

The variability data from the condition 
under which subjects were required to say 
“the” coincident with each press are shown 
in Figure 23.7 (filled squares). As in Whalen 
et al. (1999), the coefficient of variation was 
constant (scalar variability). The best-fitting 
line has a slope that does not differ signif- 
icantly from zero. The contrasting results 
from the control conditions, in which sub- 
jects counted out loud, are the open squares. 
Here, the slope (on this log-log plot) does
deviate very significantly from zero. In ver- 
bal counting, one would expect counting 
errors — double counts and skips — to be the 
most common source of variability. On the 
assumption that the probability of a count-
ing error is approximately the same at suc-
cessive steps in a count, the resulting vari-
ability in final counts should be binomial
rather than scalar. It should increase in pro-
portion to the square root of the target value,
rather than in proportion to the target value.
If the variability is binomial rather than
scalar, then when the coefficient of variation
is plotted against the target number on a
log-log plot, it should form a straight line with a
slope of −0.5. This, in fact, is what was ob-
served in the out-loud counting conditions:
The variability was much less than in the
nonverbal counting conditions and, more
importantly, it was binomial rather than
scalar. The mean slope of the subject-by-
subject regression lines in the two control
conditions was significantly less than zero
and not significantly different from −0.5.
The contrasting patterns of variability in the
counting-out-loud and nonverbal counting
conditions strengthen the evidence against
the hypothesis that subjects in the non-
verbal counting conditions were counting
subvocally.


Figure 23.7. The coefficients of variation (σ/μ)
are plotted against the numbers of presses for
the conditions in which subjects counted
nonverbally and for the condition in which they
fully pronounced each count word (double
logarithmic coordinates). In the former
condition, there is scalar variability; that is, a
constant coefficient of variation. The slope of
the regression line relating the log of the
coefficient of variation to the log of mean
number of presses does not differ from zero. In
the latter, the variability is much less and it is
binomial; the coefficient of variation decreases
in proportion to the square root of the target
number. In the latter case, the slope of the
regression line relating the log of the coefficient
of variation to the log of the mean number of
presses differs significantly from zero but does
not differ significantly from −0.5, which is the
slope predicted by the binomial variability
hypothesis. Reproduced from Cordes et al.,
2001, with permission.
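
The two predicted slopes on a log-log plot can be verified directly. The sketch below uses idealized, noise-free values: the Weber fraction `w`, the per-step error probability `p`, and the helper `loglog_slope` are hypothetical, and the binomial case assumes independent errors of equal probability at each step of the count.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) regressed on log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


targets = [4, 8, 16, 32]
w = 0.15   # hypothetical Weber fraction (scalar variability)
p = 0.05   # hypothetical probability of a counting error at each step

# Scalar: sd = w * n, so the coefficient of variation is constant.
scalar_cv = [w * n / n for n in targets]
# Binomial: sd = sqrt(n * p * (1 - p)), so the coefficient falls as 1/sqrt(n).
binomial_cv = [math.sqrt(n * p * (1 - p)) / n for n in targets]

print(loglog_slope(targets, scalar_cv))    # approximately 0
print(loglog_slope(targets, binomial_cv))  # approximately -0.5
```

The slope alone therefore distinguishes the two sources of variability, independently of how large the variability is.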

In sum, nonverbal counting may be de- 
monstrated in humans, and it looks just like 
nonverbal counting in nonhumans. More- 
over, mental magnitudes (real numbers) 
comparable to those generated by nonver- 
bal counting appear to mediate judgments of 
the numerical ordering of symbolically pre- 
sented integers. This suggests that the non- 
verbal counting system is what underlies and 
gives meaning to the linguistic representa- 
tion of numerosity. 


Nonverbal Arithmetic Reasoning 
in Humans 


In humans, as in other animals, nonver- 
bal counting would be pointless if they did 
not reason arithmetically with the result- 
ing mental magnitudes. Recent experiments 
give evidence that they can. 

Barth (2001; see also Barth et al., under 
review 2004) tested adults’ performance on 
tasks that required the addition, subtraction, 
multiplication, and division of nonverbally 
estimated numerosities, under conditions in 
which verbally mediated arithmetic was un- 
likely. Subjects were given instances of two 
numerosities in rapid sequence, each in- 
stance presented too quickly to be countable 
verbally. Then, they were given an instance 
of a third numerosity, and they indicated 
by pressing one of two buttons whether the 
sum, or difference, or product, or quotient 
of the first two numerosities was greater or 
less than the third. 

The numerosities were presented either
as dot arrays (with dot density and area
covered controlled) or as tone sequences.
In some conditions, presentation modalities
were mixed, so, for example, subjects com-
pared the sum of a tone sequence and a dot
array to either another tone sequence or an-
other dot array.


Figure 23.8. The accuracy of order judgments
for two nonverbally estimated numerosities. The
estimates of numerosity were based on direct
instantiations in the first condition (N1 < N2).
In the other conditions, one of them was derived
from the composition of two other estimates.
Data replotted from Barth, 2001, p. 109.

In Barth’s results, there was no effect of 
comparand magnitude on reaction time or 
accuracy, only an effect of their ratio. That 
is, it did not matter how big the two nu- 
merosities were; only the proportion of the 
smaller to the larger affected reaction time 
and accuracy. The same proved to be true in 
Barth’s experiments involving mental mag- 
nitudes derived by arithmetic composition. 
This enables a comparison between the case 
in which the comparands are both given di- 
rectly and the case in which one comparand 
is the estimated sum or difference of two es- 
timated numerosities. As Figure 23.8 shows, 
the accuracy of comparisons involving a sum 
was only slightly less at each ratio of the com- 
parands than the accuracy of a comparison 
between directly given comparands. 

At a given comparand ratio, the accu- 
racy of comparisons involving differences 
was less than the accuracy of a comparison 
between directly given comparands (Figure 
23.8). This could hardly be otherwise. For 
addition, the sum increases as the magni- 
tude of the pair of operands increases, but 
for subtraction, it does not; the difference
between two large operands may be
only one. The uncertainty (estimation noise)
in the operands must propagate to the result 
of the operation, so the uncertainty about 
the true value of a difference must depend 
in no small measure on the magnitude of the 
operands from which it derived. If one looks 
only at the ratio of the difference to the other 
comparand, one fails to take account of the 
presumably inescapable impact of operand 
magnitude on the noise in the difference. 

Barth’s experiments establish by direct 
test the human ability to combine noisy 
nonverbal estimates of numerosity in accord 
with the combinatorial operations that de- 
fine the system of arithmetic. In her data 
(Figure 23.8), as the proportion between the 
smaller and larger comparand increases to- 
ward unity, the accuracy of the comparisons 
degrades in a roughly parallel fashion regard- 
less of the derivation of the first comparand. 
This suggests that the scalar variability in 
the nonverbal estimates of numerosity prop- 
agates to the mental magnitudes produced 
by the composition of those estimates. 

Barth’s data, however, do not directly 
demonstrate the variability in the results of
composition nor allow one to estimate the 
quantitative relation between the noise in 
the operands and the noise in the result. 
Cordes et al. (submitted 2004) used the 
previously described key-tapping paradigm 
to demonstrate the nonverbal addition and 
subtraction of nonverbal numerical esti- 
mates and the quantitative relation between 
the variability in the estimates of the sums 
and differences and the variability in the es- 
timates of the operands. 

In the baseline condition of the Cordes 
et al. (submitted 2004) experiment, sub- 
jects saw a sequence of rapid, arrhythmic, 
variable-duration dot flashes on a computer 
screen at the conclusion of which they at- 
tempted to make an equivalent number of 
taps on one button of a two-button response 
box, tapping as rapidly as they could while
saying “the” out loud coincident with each
tap. In the compositional conditions, sub-
jects saw one sequence on the left side of the
screen, a second sequence on the right side,
and were asked to tap out either the sum or
the difference of the two numerosities. In the
subtraction condition,
they pressed the button on the side they be-
lieved to have had the fewer flashes as many
times as they felt were required to make up 
the difference. 

Sample results are shown in Figure 23.9. 
The numbers of responses subjects made, 
in all cases, were approximately linear func- 
tions of the numbers they were estimating, 
demonstrating the subjects’ ability to add 
and subtract the mental magnitudes repre- 
senting numerosities. In the baseline condi- 
tion, the variability in the numbers tapped 
out was an approximately scalar function of 
the target number, although there was some 
additive and binomial variability. 

The variability in the addition data was 
also, to a first approximation, a scalar func- 
tion of the objective sum. Not surprisingly, 
however, the variability in the subtraction 
data was not. In the case of addition, answer
magnitude covaries with operand magnitude: The
greater the magnitude of the operands, the
greater the magnitude of their sum. In sub-
traction, answer magnitude is poorly corre-
lated with operand magnitude because large-
magnitude operands often produce small
differences. Insofar as the scalar variability in 
the estimates of operand magnitudes prop- 
agates to the variability in the results of the 
operations, there will be large variability in 
these small differences. 
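
The propagation argument can be worked through numerically. The sketch assumes independent operands (no covariance), so that variances add for both sums and differences, and scalar variability in each operand with a hypothetical Weber fraction `w`; the function names are illustrative.

```python
import math


def sd_of_estimate(n, w=0.15):
    """Scalar variability: sd proportional to the estimated numerosity."""
    return w * n


def sd_of_result(a, b, w=0.15):
    """SD of a + b or a - b when the operand estimates are independent."""
    return math.sqrt(sd_of_estimate(a, w) ** 2 + sd_of_estimate(b, w) ** 2)


def cv_of_sum(a, b, w=0.15):
    return sd_of_result(a, b, w) / (a + b)


def cv_of_difference(a, b, w=0.15):
    return sd_of_result(a, b, w) / abs(a - b)


# Both pairs have a difference of 2, but the noise in that difference
# is far larger when the operands are large.
print(cv_of_difference(4, 2), cv_of_difference(20, 18))

# For sums the relative noise never exceeds the operands' Weber fraction.
print(cv_of_sum(4, 2), cv_of_sum(20, 18))
```

This is why variability in differences cannot be a scalar function of the difference itself: it depends on the size of the operands that produced it.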

Cordes et al. (submitted 2004) fit regres- 
sion models with additive, binomial, and 
scalar variance parameters to the baseline 
data, and to the addition and subtraction 
data. These fits enabled them to assess the 
extent to which the magnitude of the pair 
of operands predicted the variability in their 
sum and difference. On the assumption that 
there is no covariance in the operands, the 
variance in the results of both subtraction 
and addition should be equal to the sum of 
the variances for the two operands. When 
Cordes et al. plotted predicted variability
against directly estimated variability (Figure
23.9D), they found that the subtraction data
did conform approximately to expectations
but that the addition data clearly fell above
the line. In other words, the variability in re-
sults of subtraction was approximately what
would be expected from the sum of the
variances in the operands, but the variabil-
ity in the addition results was greater than
expected. 


Retrieving Number Facts 


There is an extensive literature on reac-
tion times and error rates in adults do-
ing single-digit arithmetic (Ashcraft, 1992;
Campbell, 1999; Campbell & Fugelsang,
2001; Campbell & Gunter, 2002; Camp-
bell, 2005; Noel, 2001). It resists easy
summary. How-
ever, magnitude effects analogous to those
found for order judgments are a salient and 
robust finding: The bigger the numerosities 
represented by a pair of digits, the longer 
it takes to recall their sum or product and 
the greater the likelihood of an erroneous 
recall. The same is true in children (Camp- 
bell & Graham, 1985). For both sets of num- 
ber facts, there is a notable exception to this 
generalization. The sums and products of 
ties (for example, 4 + 4 or 9 × 9) are re-
called much faster than is predicted by the
regressions for non-ties, although ties, too, 
show a magnitude effect (Miller, Perlmutter, 
& Keating, 1984). 

There is a striking similarity in the effect 
of operand magnitude on the reaction times 
for both addition and multiplication. The 
slopes of the regression lines (reaction time 
versus the sum or product of the numbers in- 
volved) are not statistically different (Geary, 
Widman, & Little, 1986). More importantly, 
Miller, Perlmutter, & Keating (1984) found 
that the best predictor of reaction times for 
digit multiplication problems was the reac- 
tion times for digit addition problems, and 
vice versa. In other words, the reaction-time 
data for these two different sets of facts, 
which are mastered at different ages, show 
very similar microstructure. 

These findings suggest a critical role
for mental magnitudes in the retrieval
of the basic number facts (the addition
and multiplication tables) upon which ver-
bally mediated computation strategies de-
pend. Whalen’s (1997) diamond arithmetic
experiment showed that these effects de-
pend primarily on the magnitude of the
operands, not on the magnitude of the an-
swers, nor on the frequency with which dif-
ferent facts are retrieved (although these
may also contribute). Whalen (1997) taught
subjects a new arithmetic operation of his
own devising, the diamond operation. It was
such that there was no correlation between
operand magnitude and answer magnitude.
Subjects received equal practice on each
fact, so explanations in terms of differen-
tial practice did not apply. When subjects
had achieved a high level of proficiency at
retrieving the diamond facts, Whalen mea-
sured their reaction times. He obtained the
same pattern of results seen in the retrieval
of the facts of addition and multiplication.


Figure 23.9. A. Number of responses (key taps) as a function of the number of flashes for one subject.
B. Number of responses as a function of the sum of the numbers of flashes in two flash sequences. C.
Number and sign (side) of the responses as a function of the difference between the numbers of
flashes in two sequences of flashes. D. Predicting the variability in the sums and differences from the
variability in the operands. Adapted from Cordes et al. (under review 2004) with permission.


Two Issues 


What is the Form of Mapping from 
Magnitudes to Mental Magnitudes? 


Weber’s law, that the discriminability of two 
magnitudes (two sound intensities or two 
light intensities) is a function of their ra- 
tio, is the oldest and best established quan- 
titative law in experimental psychology. Its 
implications for the question of the quanti- 
tative relation between directly measurable 
magnitudes (hereafter called objective magni- 
tudes) and the mental magnitudes by which 
they are represented (hereafter called sub- 
jective magnitudes) have been the subject of 
analysis and debate for more than a cen-
tury. This line of investigation led to work on
the foundations of measurement, work
concerning the question of what
it means to measure something (Krantz et
al., 1971; Krantz, 1972; Luce, 1990; Stevens,
1951, 1970). The key insight from work
on the foundations of measurement is that
the quantitative form of the mapping from
things to their numerical representatives can-
not be separated from the question of the
arithmetic operations that are validly per- 
formed on the results of that mapping. The 
question of the form of the mapping is mean- 
ingful only at the point at which the num- 
bers (magnitudes) produced by the mapping 
enter into arithmetic operations. 

The discussion began when Fechner used 
Weber’s results to argue that subjective mag- 
nitudes (for example, loudness and bright- 
ness) are logarithmically related to the 
corresponding objective magnitudes (sound 
and light intensity). Fechner’s reasoning is 
echoed to the present day by authors who 
assume that Weber’s law implies logarithmic 
compression in the mapping from objective 
numerosity to subjective numerosity. These 
conjectures are uninformed by the literature 
on the measurement of subjective quantities 
spawned by Fechner’s assumption. In deriv- 
ing logarithmic compression from Weber’s 
law, Fechner assumed that equally discrim- 
inable differences in objective magnitude 
correspond to equal differences in subjec- 
tive magnitude. When you directly ask sub- 
jects whether they think just discriminable 
differences in, for example, loudness, repre- 
sent equal differences, however, they do not; 
they think a just discriminable difference be- 
tween two loud sounds is greater than the 
just discriminable difference between two 
soft sounds (Stevens, 1951). 
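
Fechner's step from Weber's law to logarithmic compression can be written out compactly. The symbols below are notation introduced here for illustration, not taken from the chapter: I is the objective magnitude, psi the subjective magnitude, k the Weber fraction, and c the assumed constant subjective step per just discriminable difference.

```latex
\begin{align*}
\Delta I &= k\,I && \text{(Weber's law: the just discriminable increment is proportional to } I\text{)}\\
\Delta \psi &= c && \text{(Fechner's assumption: each such increment is the same subjective step)}\\
\frac{d\psi}{dI} &\approx \frac{\Delta\psi}{\Delta I} = \frac{c}{k\,I}
\quad\Longrightarrow\quad \psi(I) = \frac{c}{k}\,\ln I + C
\end{align*}
```

The logarithm follows only if the second line holds. The judgments reported by Stevens (1951) deny exactly that line, so Weber's law by itself does not fix the form of the mapping.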

The reader will recognize that Barth per- 
formed both experiments — the discrimina- 
tion experiment (Weber’s experiment) and 
the difference judging experiment — but with 
numerosities instead of noises. In the dis- 
crimination experiment, she found that We- 
ber’s law applied: Two pairs of nonverbally 
estimated numerosities can be correctly ordered
75% of the time when $N_1/N_2 = N_3/N_4 = .83$, where N now refers to the
(objective) numerosity of a set (Figure 23.8). 


present (Dehaene, 2002), this has been taken 
to imply that subjective numerosity is a log- 
arithmic function of objective numerosity. 
If that were so, and if subjects estimated 
the arithmetic differences between objec- 
tive magnitudes from the arithmetic differ- 
ences in the corresponding subjective mag- 
nitudes, then the Barth (2001) and Cordes 
et al. (submitted 2004) subtraction exper- 
iments would have failed, and so would 
the experiments demonstrating subtraction 
of time and number in nonverbal animals, 
because the arithmetic difference between 
the logarithms of two magnitudes repre- 
sents their quotient, not their arithmetic 
difference. 
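
The point about logarithms can be checked numerically. The following is a minimal sketch, assuming a purely hypothetical log code (the function names `log_code`, `decode`, and `subtract_log_codes` are illustrative, not part of any model in the literature): subtracting the codes and decoding the result returns the quotient of the two magnitudes, not their difference.

```python
import math

def log_code(n):
    """Hypothetical logarithmic mapping from objective to mental magnitude."""
    return math.log(n)

def decode(m):
    """Inverse of the hypothetical mapping."""
    return math.exp(m)

def subtract_log_codes(a, b):
    """Subtract the two log-coded magnitudes, then decode the result."""
    return decode(log_code(a) - log_code(b))

# For 12 and 4 the decoded answer is close to 12 / 4 = 3, not 12 - 4 = 8.
result = subtract_log_codes(12, 4)
```

A subject whose brain computed in this way would respond to ratios when asked about differences, which is not what the subtraction experiments show.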

In short, when subjects respond appropri- 
ately to the arithmetic difference between 
two numerical magnitudes, their behavior is 
not based on the arithmetic difference be- 
tween mental (subjective) magnitudes that 
are proportional to the logarithms of the ob- 
jective magnitudes. That much is clear. Ei- 
ther (Model 1): The behavior is based on the 
arithmetic difference in mental magnitudes 
that are proportional to the objective mag- 
nitudes (a proportional rather than loga- 
rithmic mapping). Or (Model 2): Dehaene 
(2001) has suggested that mental magni- 
tudes are proportional to the logarithms 
of objective magnitudes and that, to ob- 
tain from them the mental magnitude cor- 
responding to the objective difference, the 
brain uses a look-up table, a procedure anal- 
ogous to the procedure that Whalen’s (1997) 
subjects used to retrieve the facts of diamond 
arithmetic. In this model, the arithmetic dif- 
ference between two mental magnitudes is 
irrelevant; the two magnitudes serve only 
to specify where to enter the look-up ta- 
ble — where in memory the answer is to 
be found. 

In summary, there are two intimately in- 
terrelated unknowns concerning the map- 
ping from objective to subjective magni- 
tudes — the form of the mapping and the 
formal character of the operations on the re- 
sults of the mapping. Given the experimen- 
tal evidence showing valid arithmetic pro- 
cessing, knowing either would fix the other. 
either, can behavioral experimental evidence
decide between the alternative models? Per- 
haps not definitively, but there are relevant 
considerations. The Cordes et al. (submit- 
ted 2004) experiment estimates the noise in 
the results of the mental subtraction oper- 
ation at and around zero difference (Figure 
23.9C). There is nothing unusual about the 
noise around answers of approximately zero. 
It is unclear what assumptions about noise 
would enable a logarithmic mapping model 
to explain this. The logarithm of a quantity 
goes to minus infinity as the quantity ap- 
proaches zero, and there are no logarithms 
for negative quantities. On the assumption 
that realizable mental magnitudes, like real- 
izable nonmental magnitudes, cannot be in- 
finite, the model has to treat zero as a spe- 
cial case. How the treatment of that special 
case could exhibit noise characteristics of a 
piece with the noise well away from zero 
is unclear. 

It is also unclear how the logarithmic- 
mapping-plus-table-lookup model can deal 
with the fact that the sign of a difference 
is not predictable a priori. In this model, 
a bigger magnitude (number) cannot be 
subtracted from a smaller, because the re- 
sulting negative number does not have a 
logarithm; there is no way to represent a 
negative magnitude in a scheme in which 
magnitudes are represented by their loga- 
rithms. Thus, this model is not closed under 
subtraction. 
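
The closure problem can be made concrete with a small sketch. It assumes, for illustration only, that the answer to a subtraction must itself be stored as the logarithm of the difference; `log_code_difference` is a hypothetical helper, not a published model.

```python
import math

def log_code_difference(a, b):
    """Try to represent a - b under a hypothetical log code.

    Returns None when the difference has no real logarithm.
    """
    diff = a - b
    try:
        return math.log(diff)
    except ValueError:
        # math.log raises ValueError for zero and for negative arguments.
        return None

# Which operand pairs drawn from 1..3 yield a representable answer?
representable = {
    (a, b): log_code_difference(a, b) is not None
    for a in range(1, 4)
    for b in range(1, 4)
}
```

Only the pairs in which the first operand is strictly larger come out representable; equal operands (a zero difference) and reversed operands (a negative difference) do not.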


Is There a Distinct Representation for 
Small Numbers? 


When instantiated as arrays of randomly ar- 
ranged small dots, presented for a fraction 
of a second, small numerosities can be es- 
timated more quickly than large ones, but 
only up to about six. Thereafter, the esti- 
mates increase more or less linearly with the 
number of dots, but the reaction time is flat 
(Figure 23.10). 

Subjects’ confidence in their estimates 
also falls off precipitously after six (Kauf- 
man et al., 1949; Taves, 1941). This led Taves 
to argue that the processes by which sub- 

Figure 23.10. Estimates of dot numerosity (top)
and time to make an estimate (bottom) as
functions of the number of dots in
tachistoscopically presented arrays of randomly
positioned dots. Plotted from the data for the
speeded instruction group in Table 1 of Kaufman
et al., 1949, p. 510.


jects arrive at estimates for numerosities of 
five or fewer are distinct from the processes 
by which they arrive at estimates for nu- 
merosities of seven or more. Kaufman et 
al. (1949) coined the term subitizing to de- 
scribe the process that operates in the range 
below six. 

When the dot array to be enumerated is 
displayed until the subject responds, rather 
than very briefly by a tachistoscope, the re- 
action time function is superimposable on 
the one shown in Figure 23.10, up to and 
including numerosity six. It does not level 
off at six, however; rather, it continues with 
the same slope (about 325 ms/dot) indefi- 
nitely (Jensen, Reese, & Reese, 1950). This 
slope represents the time it takes to count 
subvocally. The discontinuity at six therefore
represents the point at which a nonverbal
numerosity-estimating mechanism or
verbal counting, because, presumably, it is
not possible to count verbally more than six 
items under tachistoscopic conditions. 

The nonverbal numerosity-estimating 
process is probably the basis for the demon- 
strated capacity of humans to compare (or- 
der) large numerosities instantiated either 
visually or auditorily (Barth, Kanwisher, & 
Spelke, 2003). The reaction times and accu- 
racies for these comparisons show the We- 
ber law characteristic, which is a signature 
of the process that represents numerosities 
by mental magnitudes rather than by dis- 
crete wordlike symbols (Cordes et al., 2001). 
The assumption that the representation is by 
mental magnitudes regardless of the mode 
of presentation is consistent with the finding 
that there is no cost to cross-modal compar- 
isons of large numerosities; these compar- 
isons take no longer and are no more inaccu- 
rate than comparisons within presentation 
modes (Barth et al., 2003). 

There is controversy about the implica- 
tions of the reaction time function within 
the subitizing range below six. In this range, 
there is approximately a 30-ms increment in 
going from one to two dots, an 80-ms incre- 
ment in going from two to three, and a 200- 
ms increment in going from three to four. 
These are large increments. The net incre- 
ment from one to four is about 300 ms, half 
the total latency to respond to a one-item ar- 
ray (Jensen, Reese, & Reese 1950; Kaufman 
et al., 1949; Mandler & Shebo, 1982). More- 
over, the increments increase at each step. 
In particular, the step from two to three is 
significantly greater than the step from one 
to two in almost every data set. 

It is often claimed that there is a discon- 
tinuity in the reaction time function within 
the subitizing range (Davis & Pérusse, 1988; 
Klahr & Wallace, 1973; Piazza et al., 2003; 
Simon, 1999; Strauss & Curtis, 1984; Wood- 
worth & Schlosberg, 1954); but it also often 
has been pointed out that there is no em- 
pirical support for this claim (Balakrishnan 
& Ashby, 1992). Because the reaction time 
function is neither flat nor linear in the range 
from one to three, it offers no support for 
the common theory that very small numbers 


out by the authors who coined the term 
subitizing (Kaufman et al., 1949).

Gallistel and Gelman (1992) and Dehaene
and Cohen (1994) suggested that,
in the subitizing range, there is a transi- 
tion from a strategy based on mapping from 
nonverbally estimated mental magnitudes to 
a strategy based on verbal counting. This 
hypothesis has recently received important 
support from a paper by Whalen and West 
(2001). By strongly encouraging rapid, ap- 
proximate estimates and taking measures 
to make verbal counting more difficult, 
Whalen and West (2001) obtained a reaction
time function with a slope of 47 ms per item, 
from one to sixteen items. 

The coefficient of variation in the estimated
numbers was constant from 1 to 16, at
about 14.5%, which is close to the value of 
16% in the animal timing literature (Gallis- 
tel, King, & McDonald, 2004). The Whalen 
et al. data therefore show scalar variabil- 
ity in rapid number estimates all the way 
down to estimates of one and two, as do 
the data of Cordes et al. (2001). Whalen
and West (2001) show that with this level
of noise in the mental magnitudes being 
mapped to number words, the expected 
percent errors in the resulting verbal esti- 
mates of numerosity are close to zero in 
the range one to three and increase rapidly 
thereafter — in close accord with the exper- 
imentally observed percent errors in their 
speeded condition (Figure 23.11). This ex- 
plains why subjects in experiments in which 
it is not strongly discouraged switch to sub- 
vocal verbal counting somewhere between 
four and six, and why their confidence in 
their speeded estimates falls off rapidly af- 
ter six (Kaufman et al., 1949; Taves, 1941). 
Whalen et al. (under review) attribute the 
constant slope of 47 ms/item in the speeded 
reaction time function to a serial nonver- 
bal counting process. In short, the reaction 
time function does not support the hypoth- 
esis that there are percepts of twoness and 
threeness, constituting a representation of 
small numerosities incommensurable with 
the mental magnitudes that represent other 
numerosities. 

Figure 23.11. The observed percent errors as a
function of number of dots in Whalen’s speeded 
condition compared with the percent expected 
from the hypothesis that the estimates were 
obtained by way of a mapping from nonverbal 
mental magnitudes to the corresponding number 
words and that the mental magnitudes had scalar 
variability with a coefficient of variation of 0.145. 
Reproduced from Whalen et al. (under review) 
with permission. 
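
The logic behind the expected error curve can be sketched as follows. This is not the authors' analysis: it assumes a Gaussian mental magnitude with mean n and standard deviation 0.145 * n, and it assumes, hypothetically, that the number word chosen is the integer nearest to the magnitude. The decision rule and the names `normal_cdf` and `expected_error_rate` are introduced here for illustration, so the exact values need not match Figure 23.11.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_error_rate(n, cv=0.145):
    """P(nearest-integer readout != n) for a magnitude ~ Normal(n, (cv * n)^2).

    The nearest-integer readout is an assumption of this sketch.
    """
    sd = cv * n
    p_correct = normal_cdf(0.5 / sd) - normal_cdf(-0.5 / sd)
    return 1.0 - p_correct

# Error rates for numerosities 1 through 16 under the assumed rule.
rates = {n: expected_error_rate(n) for n in range(1, 17)}
```

Because the noise grows in proportion to n while the spacing between adjacent integers stays fixed, the error rate under this rule is negligible at one and rises steadily with numerosity. The qualitative pattern, not the particular values, is the point.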


The Development of Verbal 
Numerical Competence 


It appears that the system of nonverbal men- 
tal magnitudes plays a fundamental role 
in verbal numerical behavior: When verbal 
counting is too slow to satisfy time con- 
straints, it mediates the finding of a number 
word that specifies approximately the nu- 
merosity of a set. It mediates the ordering of 
the symbolic numbers and the numerosities 
they represent. And it mediates the retrieval 
of the verbal number facts (the addition and 
multiplication tables) upon which verbal 
computational procedures rest. All of these 
roles require a mapping between the men- 
tal magnitudes that represent numerosity 
and number words and written numerals. In 
the course of ordinary development, there- 
fore, humans learn a bidirectional mapping 
between the mental magnitudes that repre- 
sent numerosity and the words and numerals 
that represent numerosity (Gallistel & Gel- 
man, 1992; Gelman & Cordes, 2001). They 
make use of this bidirectional mapping in 
talking about number and the effects of com- 
binatorial operations with numbers. There is 


the literature on numerical cognition be- 
cause of the abundant evidence for Weber- 
law characteristics in symbolic numerical 
behavior. The literature on the deficits in nu- 
merical reasoning seen in brain-injured pa- 
tients is broadly consistent with this same 
conclusion (Dehaene, 1997; Noel, 2001). 

It also seems plausible that the nonver- 
bal system of numerical reasoning mediates 
verbally expressed numerical reasoning. It 
seems plausible, for example, that adults be- 
lieve that (2 + 1) > 2 and four minus two 
is less than four because that is the behav- 
ior of the mental magnitudes to which they 
(unconsciously) refer those symbols to en- 
dow them with meaning and reference to 
the world. 

Empiricists offer as an alternative the hy- 
pothesis that adults believe these symbolic 
propositions because they have repeatedly 
observed that the properties of the world 
to which the words or symbols refer behave 
in this way. Adults know, for example, that 
the word two refers to every set that can 
be placed in one-one correspondence with 
some foundational set of two and likewise, 
mutatis mutandis, for the word one, and that 
the phrase plus refers to the uniting of sets, 
and that the phrase greater than refers to the 
relation between a set and its proper subsets, 
and so on. From an empiricist’s perspective, 
the words have these real world references 
only by virtue of the experiences adults have 
had, which are ubiquitous and universal. 

Nativists or rationalists respond that ref- 
erence to the world by verbal expressions is 
mediated by preverbal world-referring sym- 
bolic systems in the mind of the hearer and 
that the ubiquity and universality of the ex- 
periences that are supposed to have created 
world-reference for these expressions are 
grounds for supposing that symbolic systems 
with these properties are part of the innate 
furniture of the mind. We will not pursue 
this old debate further, except to note the 
possible relevance of the experiments previ- 
ously reviewed demonstrating that nonver- 
bal animals reason arithmetically about both 
numerosities (integer quantities) and magni- 
tudes (continuous quantities). 
literature on numerical competence in very
young children. It is difficult to demonstrate 
conclusively behavior based on numerosity 
in infants because it is hard not to confound 
variation in one or more continuous quanti- 
ties with variation in numerosity, and infants 
often respond on the basis of continuous di- 
mensions of the stimulus (Clearfield & Mix, 
1999; Feigenson, Carey, & Spelke, 2002; see 
Mix, Huttenlocher, & Levine, 2002, for re- 
view). Nonetheless, there are studies that 
appear to demonstrate sensitivity to numer- 
ical order in infants (Brannon, 2002). More- 
over, the ability of infants to discriminate 
sets on the basis of numerosity extends to 
pairs as large as eight versus sixteen (Lipton 
& Spelke, 2003; Xu & Spelke, 2000). As a 
result, there is reason to suppose that prever- 
bal children share with nonverbal animals a 
nonverbal representation of numerosity. 
The assumption that preverbal children 
represent numerosities by a system of men- 
tal magnitudes homologous to the system 
found in nonverbal animals is the foundation 
of the account of the development of verbal 
numerical competence suggested by Gel- 
man and her collaborators (Gelman & Bren- 
neman, 1994; Gelman & Cordes, 2001; Gel- 
man & Williams, 1998). They argue that the 
development of verbal numerical compe- 
tence begins with learning to count, which is 
guided from the outset by the child’s recog- 
nition that verbal counting is homomorphic 
to nonverbal counting. In nonverbal count- 
ing, the pouring of successive cups into the 
accumulator (the addition of successive unit 
magnitudes to a running sum) creates a one- 
to-one correspondence between the items in 
the enumerated set and a sequence of men- 
tal magnitudes. Although the mental magni- 
tudes thus created have the formal proper- 
ties of real numbers, the process that creates 
them generates a discretely ordered se- 
quence of mental magnitudes, an ordering 
in which each magnitude has a next magni- 
tude. The final magnitude represents the nu- 
merosity of the set. Verbal counting does the 
same thing; it assigns successive words from 
an ordered list to successive items in the set 
being enumerated, with the final word rep- 
resenting the cardinality of the set. 
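
The claimed homomorphism between the two counting processes can be illustrated with a toy sketch. It is a deliberate simplification: the accumulator is modeled without noise, the unit magnitude is set arbitrarily to 1.0, and the names `nonverbal_count`, `verbal_count`, and `COUNT_WORDS` are hypothetical.

```python
# A short, stably ordered list of count words (illustrative only).
COUNT_WORDS = ["one", "two", "three", "four", "five", "six"]

def nonverbal_count(items, unit=1.0):
    """Add one unit magnitude per item and return the running sums."""
    total = 0.0
    sums = []
    for _ in items:
        total += unit
        sums.append(total)
    return sums

def verbal_count(items):
    """Assign successive count words to successive items."""
    items = list(items)
    if len(items) > len(COUNT_WORDS):
        raise ValueError("set is larger than the illustrative word list")
    return COUNT_WORDS[:len(items)]

mice = ["mouse_a", "mouse_b", "mouse_c"]
magnitudes = nonverbal_count(mice)        # [1.0, 2.0, 3.0]
words = verbal_count(mice)                # ["one", "two", "three"]
pairing = list(zip(words, magnitudes))    # each word paired with a magnitude
```

Both procedures step through the set one item at a time, and in both the final element (the last running sum, the last word) stands for the numerosity of the whole set.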


the principles that govern nonverbal count- 
ing inform the child’s counting behav- 
ior from its inception (Gelman & Gallis- 
tel, 1978). Children recognize that number 
words reference numerosities because they 
implicitly recognize that they are generated 
by a process homomorphic to the nonverbal 
counting of serially considered sets. Num- 
ber words have meaning for the child, as for 
the adult, because it recognizes at an early 
age that they map to the mental magnitudes 
by which the nonverbal mind represents nu- 
merosities. On this account, the child’s mind 
tries to apply from the outset the Gelman 
and Gallistel counting principles (Gelman & 
Gallistel, 1978) - that counting must involve 
a one-one assignment of words to items in 
the set, that the words must be taken from 
a stably ordered list, and that the last word 
represents the cardinality of the set. It takes 
a long time to learn the list and to implement
the verbal counting procedure flawlessly, be- 
cause list learning is hard, because the im- 
plementation of the procedure is challeng- 
ing (Gelman & Greeno, 1989), and because 
the child is often confused about what the 
experimenter wants. 

Critical to Gelman’s account is evidence 
that during the period when they are learn- 
ing to count children already understand 
that the last count word represents a prop- 
erty of the set about which it is appropri- 
ate to reason arithmetically. Without such 
evidence, there is no ground for believing 
that the child has a truly numerical rep- 
resentation. Evidence on this crucial point 
comes from the so-called magic experiments 
(Brannon & Van de Walle, 2001; Bullock & 
Gelman, 1977; Gelman, 1972, 1977, 1993). 
These experiments drew children into a 
game in which a winner and loser plate could 
be distinguished on the basis of the num- 
ber of toy mice they contained. The task en- 
gaged children’s attention and caused them 
to justify their judgments as to whether an 
uncovered plate was or was not the win- 
ner. Children as young as two and a half 
years indicated that the numerosity was the 
decisive dimension, and they spontaneously 
counted to justify their judgment that the 
plate with the correct numerosity was the 
surreptitiously added or subtracted from the
winner plate during the shuffling, so that it 
had the same numerosity as the loser plate. 
Now, both plates when uncovered were re- 
vealed to be loser plates. In talking about 
what surprised them, children indicated that 
something must have been added or sub- 
tracted, and they counted to justify them- 
selves. This is strong evidence that chil- 
dren as young as two and one half years of 
age understand that counting gives a rep- 
resentation of numerosity about which it 
is appropriate to reason arithmetically. This 
is well before they become good counters 
(Fuson, 1988; Gelman & Gallistel, 1978; 
Hartnett & Gelman, 1998). Surprised two-and-a-half-year-olds
made frequent use of
number words. They used them in idiosyn- 
cratic ways, but ways that nonetheless con- 
formed to the counting principles (Gelman, 
1993), including the cardinality principle. 

A second account of the development 
of counting and numerical understanding 
grows, first, out of the conviction of many 
researchers that, although two-year-olds 
count, albeit badly, they do not understand 
what they are doing (Carey, 2001a, 2001b;
Fuson, 1988; Mix, Huttenlocher, & Levine,
2002; Wynn, 1990; Wynn, 1992b). It rests,
secondly, on evidence suggesting that in the 
spontaneous processing of numerosities by 
infants and monkeys, there is a discontinuity 
between numbers of four or less and big- 
ger numbers. In some experiments, the in- 
fant and monkey subjects discriminate all 
numerosity pairs in the range one to four 
but fail to discriminate pairs that include a 
numerosity outside that range (e.g., <3,6>), 
even when, as in the example, their ratio 
is greater than the ratio between discrim- 
inable pairs of four or less (Feigenson, Carey, 
& Hauser, 2002; Uller et al., 1999; Uller,
Hauser, & Carey, 2001). 

How to reconcile these latter findings 
with the finding that infants do discriminate 
the pair <8,16> (Lipton & Spelke, 2003; Xu
& Spelke, 2000) is unclear. Similarly, it is 
unclear how to reconcile the monkey find- 
ings with the literature showing the discrim- 
ination of numerosities small and large in 
nonverbal animals. Particularly to be borne 


that monkeys cannot be taught to order nu- 
merosities in other than a numerical order 
(Brannon & Terrace, 2000), even though 
they can be taught to order things other than 
numerosities in an arbitrary, experimenter- 
imposed order (Terrace, Son, & Brannon, 
2003). This implies that numerical order is 
spontaneously salient to a monkey. 

The account offered by Carey (Carey, 
2001a, 2001b) begins with the assumption 
that convincing cases of infant number dis- 
crimination involving numbers less than four 
may depend on the object tracking system. 
In Wynn’s (1992a) experiment, for exam- 
ple, the infants saw an object appear to join 
or leave one or two objects behind an oc- 
cluding screen. They were surprised when 
the screen was removed to reveal a number 
of objects different from the number that 
ought to have been there. This surprise may 
have arisen only from the infant’s belief in 
object permanence. 

When an infant sees an object move be- 
hind an occluding screen, the subsequent re- 
moval of which fails to reveal an object, the 
infant is surprised (Baillargeon, 1995; Bail- 
largeon, Spelke, & Wasserman, 1985). The 
child’s surprise presumably is mediated by 
a system for tracking objects, such as the 
object file system suggested by Kahneman, 
Treisman, and Gibbs (1992) or the FINST 
system suggested by Pylyshyn and Storm 
(1988). This system maintains a marker (ob- 
ject file or FINST) for each object it is track- 
ing, but it can only track about four objects 
(Scholl & Pylyshyn, 1999). As a result, in- 
fants in experiments like Wynn’s are sur- 
prised for the same reason as in original 
object-permanence experiments: An object 
is missing. The infant has an active mental 
marker or pointer that no longer points to 
an object. Alternatively, there is an object 
for which it has no marker. 

Carey argues that sets of object files are 
the foundations on which the understanding 
of integers rests. The initial meaning of the 
words one, two, three, and four does not come 
from the corresponding mental magnitudes; 
rather, it comes from sets of object files. The 
child comes to recognize the ordering of the 
referents of one, two, three, and four because 
subset a set of one object file, and so on. The
child comes to recognize that addition ap- 
plies to the things referred to by these words 
because the union of two sets of object files 
yields another set of object files (provided 
the union does not create a set greater than 
four). This is the foundation of the child’s 
belief in the successor principle: Every inte- 
ger has a unique successor. 

This account seems to ignore the basic 
function of a set of, for example, two object 
files (FINSTs, pointers), which is to point 
to two particular objects. If two referred to 
a particular set of two object files, it pre- 
sumably would be usable only in connection 
with the two objects it pointed to. It would 
be a name for that pair of objects, not for 
all sets that share with that set the property 
of twoness. 

A particular set of pointers cannot sub- 
stitute for (is not equal to) another such set 
without loss of function, because its function 
is to point to one pair of objects, whereas 
the function of another such set is to point 
to a different pair. There is no reason to be- 
lieve that there is any such thing as a gen- 
eral set of two pointers — a set that does 
not point to any particular set of two ob- 
jects, but represents all the sets that do so 
point. Any set of two object files is an in- 
stance of a set with the twoness property (a 
token of twoness), but it can no more repre- 
sent twoness than a name that picks out one 
particular dog (e.g., Rover) can represent the 
concept of a dog. A precondition of Rover's 
serving the latter function is that it not serve 
the former. By contrast, any instance of the 
numeral 2 can be substituted for any other 
without loss of function, and so can a pair of 
hash marks. 

A second problem with this account is 
that it is unclear how a system so lacking 
in closure could be the basis for inferring a 
system, the function of which depends so 
strongly on closure. The Carey suggestion is 
motivated by findings that the maximum nu- 
merosity of a set of active object files is at 
most four. There are only nine numerically 
distinct unordered pairs of sets of four or 
less (<1,1>, <1,2>, <1,3>, <1,4>, <2,2>,


of the nine pairs, when composed (united) 
yield a set too numerous to be a set of ob- 
ject files. From this foundation, the mind of 
the child is said to infer that the numbers 
may be extended indefinitely by addition. 
One wants to know what the inference rule 
is that ignores the many negative instances 
in the base data set. 


Conclusions and Future Directions 


There is a widespread consensus, backed by 
a large and diverse experimental literature, 
that adult humans share with nonverbal ani- 
mals a nonverbal system for representing dis- 
crete and continuous quantity that has the 
formal properties of continuous magnitudes. 
Mental magnitudes represent quantities in 
the same sense that, given a proper mea- 
surement scheme, real numbers represent 
line lengths. That is, the brains of nonverbal 
animals perform arithmetic operations with 
mental magnitudes; they add, subtract, mul- 
tiply, divide, and order them. The processes 
or mechanisms that map numerosities (dis- 
crete quantities) and magnitudes (continu- 
ous quantities) into mental magnitudes, and 
the operations that the brain performs on 
them, are together such that the results of 
the operations are approximately valid, al- 
beit imprecise; the results of computations 
on mental magnitudes map appropriately 
back onto the world of discrete and contin- 
uous quantity. 

Scalar variability is a signature of the men- 
tal magnitude system. Scalar variability and 
Weber’s law are different sides of the same 
coin: Models that generate scalar variabil- 
ity also yield Weber’s law. There are two 
such models. One assumes that the mapping 
from objective quantity to subjective quan- 
tity (mental magnitude) is logarithmic; the 
other assumes that it is scalar. Both assume 
noise. That is, they assume that the signal 
corresponding to a given objective quantity 
varies from occasion to occasion in a man- 
ner described by a Gaussian probability den- 
sity function. The variation is on the order 
speeded number estimation.

The first model (logarithmic mapping) as- 
sumes that scalar behavioral variability re- 
flects a constant level of noise in the sig- 
nal distributions. This yields proportional 
(scalar) variability, because constant log- 
arithmic intervals correspond to constant 
proportions in the corresponding nonlog- 
arithmic magnitudes. The second model 
(scalar mapping) assumes scalar variability 
in the underlying signal distributions. The 
overlap in the two signal distributions is a 
function only of the ratio between the rep- 
resented numerosities in both models, which 
is why they both predict Weber’s law. 
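
That both models make discriminability depend only on the ratio can be verified with a short sketch. It assumes independent Gaussian noise on the two signals and a simple "which is larger" comparison; the noise parameters `sigma` and `w` are set to an arbitrary 0.15, and the function names are illustrative.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_correct_log(n_small, n_large, sigma=0.15):
    """Log mapping with constant Gaussian noise (sd sigma) on each signal."""
    d = math.log(n_large) - math.log(n_small)
    return normal_cdf(d / (sigma * math.sqrt(2.0)))

def p_correct_scalar(n_small, n_large, w=0.15):
    """Proportional mapping with noise sd w * n on each signal."""
    d = n_large - n_small
    sd = w * math.sqrt(n_small ** 2 + n_large ** 2)
    return normal_cdf(d / sd)

# Three pairs that share the same ratio, 5:6.
pairs = [(5, 6), (10, 12), (50, 60)]
log_probs = [p_correct_log(a, b) for a, b in pairs]
scalar_probs = [p_correct_scalar(a, b) for a, b in pairs]
```

Within each list the three probabilities agree, because the absolute sizes cancel and only the ratio survives. Ordering accuracy alone therefore cannot distinguish the two mappings.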

Both models assume there is only one 
mapping from objective quantities to sub- 
jective quantities (mental magnitudes), but 
there is no compelling reason to accept this 
assumption. The question of the quantita- 
tive form of the mapping makes sense only 
at the point at which the mental magni- 
tudes enter into combinatorial operations. 
The form may differ for different combina- 
torial operations. In the future, the analysis 
of variability in the answers from nonverbal 
arithmetic may decide between the models. 
An important component of future models, 
therefore, must be the specification of how 
variability propagates from the operands to 
the answers. 

The system of mental magnitudes plays 
many important roles in verbalized adult 
number behavior. For example, it mediates 
judgments of numerical order and the re- 
trieval of the verbal number facts (addition 
and multiplication tables) upon which ver- 
balized and written calculation procedures 
depend. It also mediates the finding of num- 
ber words to represent large numerosities, 
presented too briefly to be verbally counted, 
and, more controversially, the rapid retrieval 
of number words to represent numerosities 
in the subitizing range (one through six). 

Any account of the development of verbal 
numerical competence must explain how 
subjects learn the bidirectional mapping be- 
tween number words and mental magni- 
tudes, without which mental magnitudes 
could not play the roles just described. One 


merical competence assumes that it is di- 
rected from the outset by the mental magni- 
tude system. The homomorphism between 
serial nonverbal counting and verbal count- 
ing is what causes the child to appreciate the 
enumerative function of the count words. 
The child attends to these words because of 
the homomorphism. Learning their mean- 
ing is the process of learning their mapping 
to the mental magnitudes. Another account 
assumes that the count words from one to 
four are initially understood to refer to sets 
of object files — mental pointers that pick 
out particular objects. On this account, the 
learning of the mapping to mental magni- 
tudes comes later, after the child has exten- 
sive counting experience. 

Notes


1. Technically, not really true, because Cantor 
discovered a way to assign a unique positive 
integer to every rational number. The integers 
his procedure assigns, however, are useless for 
computational purposes. 

2. Fortran and C programmers who have made
the mistake of dividing an integer variable by 
a floating point variable will know whereof 
we speak. 

3. The magnitude of a pair of numbers is the 
square root of the sum of their squares. 
CHAPTER 24 


Effects of Aging on Reasoning 


Timothy A. Salthouse 


This chapter reviews empirical research on 
adult age differences in reasoning. It is im- 
portant to begin with three disclaimers, 
however. First, although many types of rea- 
soning have been identified (e.g., deductive, 
inductive, analogical, and visuospatial; see 
articles in this volume by Evans, Chap. 8; 
Sloman & Lagnado, Chap. 5; Buehner & 
Cheng, Chap. 7; Holyoak, Chap. 6; and 
Tversky, Chap. 10), few age-comparative 
studies have included more than two or 
three different reasoning variables and, as a 
result, there is little evidence for distinctions 
among various types of reasoning in stud- 
ies of aging. Different reasoning tasks there- 
fore are considered together in this chap- 
ter, although it is recognized that combining 
them in this manner may be obscuring po- 
tentially important distinctions. Second, the 
discussion is limited to reasoning tasks with 
minimal involvement of knowledge. Because 
knowledge is likely relevant in most every- 
day reasoning, the tasks discussed may refer 
to only a subset of real-life reasoning. The 
third disclaimer is that most of the discus- 
sion refers to research derived from my lab- 
oratory. This obviously represents only a por- 


tion of the relevant literature, but limitations 
of space preclude comprehensive coverage 
of all of the research related to the topic of 
aging and reasoning. A more inclusive review 
of the earlier literature on this topic can be 
found in Salthouse (1992a). 

Some of the most convincing data on 
the relations between age and reasoning are 
those derived from standardized tests be- 
cause the variables were designed to opti- 
mize psychometric properties such as sensi- 
tivity, reliability, and construct validity, and 
the normative samples have typically been 
moderately large and selected to be rep- 
resentative of the general population (see 
Sternberg, Chap. 31, for discussion of intelli- 
gence tests). Three recent cognitive test bat- 
teries have each included at least two mea- 
sures of reasoning. The tests included in the 
Kaufman Adult Intelligence Test (Kaufman 
& Kaufman, 1993) were described on page 6 
of the test manual in the following manner: 
Logical Steps — “Examinees attend to logical 
premises presented both visually and aurally, 
and then respond to a question making use of 
the logical premises;” and Mystery Codes — 
“Examinees study the identifying codes 

Figure 24.1. Relations of reasoning performance to age in variables from 
standardized tests. Sample sizes were 1,350 for the Kaufman Adolescent and 
Adult Intelligence Test (KAIT), 2,050 for the Wechsler Adult Intelligence Scale 
(WAIS) III, and 2,505 for the Woodcock Johnson (WJ) III. 


associated with a set of pictorial stimuli 
and then figure out the code for a novel 
pictorial stimulus.” Two reasoning tests in- 
cluded in the latest version of the Wechsler 
test battery, the Wechsler Adult Intelligence 
Scale III (Wechsler, 1997) were described in 
Table 24.1 of the Administration and Scoring 
Manual as follows: Similarities — “A series of 
orally presented pairs for which the exam- 
inee explains the similarity of the common 
objects or concepts they represent;” and Ma- 
trix Reasoning — “A series of incomplete grid- 
ded patterns that the examinee completes 
by pointing to or saying the number of the 
correct response from five possible choices.” 
Finally, two reasoning tests included in 
the Woodcock-Johnson III (Woodcock, 
McGrew, & Mather, 2001) battery were de- 
scribed in Table 4.2 of the Examiner’s Man- 
ual as follows: Concept Formation — “Identi- 
fying, categorizing, and determining rules;” 
and Analysis-Synthesis — “Analyzing puzzles 
(using symbolic formulations) to determine 
missing components.” 

To allow across-variable comparisons, the 
variables must be converted into the same 


scale, and a convenient scale for this purpose 
is standard deviation units. (These particu- 
lar variables could have been expressed in 
units of percentage correct, but that scale is 
not as widely applicable because, for exam- 
ple, it is not meaningful when the variables 
are measured in units of time.) The manuals 
for these tests did not present the norma- 
tive data in a form that would allow con- 
version of the scores to standard deviation 
units of the total sample. However, it was 
possible to express the scores in standard de- 
viations of a young adult group, which has 
the advantage that the magnitude of the age- 
related effect can be expressed relative to the 
peak level of performance achieved across all 
ages. Age relations in the six reasoning tests 
just described therefore are portrayed in Fig- 
ure 24.1 in standard deviation units of a ref- 
erence group of young adults. 
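
The conversion just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `to_reference_sd_units` and all of the scores are hypothetical, and the normative procedures in the test manuals are not reproduced here.

```python
from statistics import mean, stdev


def to_reference_sd_units(score, reference_scores):
    """Express a score as its distance from the reference-group mean,
    measured in standard deviations of the reference group."""
    ref_mean = mean(reference_scores)
    ref_sd = stdev(reference_scores)  # sample standard deviation
    return (score - ref_mean) / ref_sd


# Hypothetical raw scores for a young adult reference group (not real data).
young_adult_scores = [18, 20, 22, 24, 26]

# Hypothetical mean raw score of an older age group (not real data).
older_group_mean = 19.0

older_in_sd_units = to_reference_sd_units(older_group_mean, young_adult_scores)
print(round(older_in_sd_units, 2))
```

A value near -1 on this scale would correspond to a group performing about one standard deviation below the young adult reference group, which is the form in which the age relations are reported in this chapter.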

Examination of the figure reveals that all 
of the variables exhibit the same trend of 
lower performance with increased age. In 
particular, for most of the variables, the av- 
erage seventy-year-old is performing about 
one standard deviation below the average 

Series Completion 

13 - 15 - 20 - 28 - 39 - ? 


Analytical Reasoning 

Jason and Jessica are planning a dinner party and have 
invited six guests: Mark and Meredith, Christopher and 
Courtney, and Shawn and Samantha. Their table seats 
three people on each side and one at each end. In planning 
the seating arrangements they need to: have Jason and 
Jessica sit at opposite ends of the table; place Christopher 
at a corner with no one on his left; not have Mark seated 
next to Samantha; and have Courtney seated next to Meredith. 

Which of the following is an acceptable arrangement of 
diners along one side of the table? 

  Jason, Samantha, Mark 
  Christopher, Jessica, Shawn 
  Mark, Courtney, Samantha 
  Meredith, Shawn, Courtney 
  Shawn, Christopher, Meredith 


Integrative Reasoning 

F and G do the SAME 
E and F do the OPPOSITE 
G and H do the OPPOSITE 
If E increases will H decrease? 


Figure 24.2. Examples of problems in four different reasoning tasks used in studies by Salthouse and 
colleagues. See text for details. 


of the young adults. The age trends are not 
completely uniform because the age effects 
appear to be later and smaller for some 
variables (e.g., Similarities) than for other 
variables (e.g., Analysis-Synthesis). How- 
ever, it is important to note that there is 
also considerable across-sample variation be- 
cause the age gradients are shallower for 
both Wechsler subtests (i.e., Similarities and 
Matrix Reasoning) than for the subtests from 
the other batteries. 

Relations between age and measures of 
reasoning can also be illustrated with four 
reasoning tasks used in several studies in my 
laboratory. Examples of problems in each 
type of task are portrayed in Figure 24.2. 
In matrix reasoning tasks (such as Raven’s 
Progressive Matrices, Raven, 1962), the ex- 
aminee attempts to select the best comple- 
tion of the missing cell from the alternatives 
presented below the matrix. The goal in se- 
ries completion tasks (such as the Shipley 
Abstraction Test, Zachary, 1986) is to deter- 


mine the item that provides the best contin- 
uation of the sequence of items. In analyt- 
ical reasoning tasks, the examinee uses the 
presented information to determine which 
of several alternatives best satisfies the spec- 
ified constraints. Finally, examinees in inte- 
grative reasoning tasks use the information 
in the premises to answer a question about 
the relation between two of the variables. 
Although no formal evidence is available, 
it seems likely that these four tests repre- 
sent somewhat different types of reasoning, 
and they certainly involve different require- 
ments and types of material. 
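
The logic of the integrative reasoning problem in Figure 24.2 can be made explicit with a short Python sketch. It is offered only as an illustration of the inference the examinee must perform, not as a model of how people solve the task; the function name `infer_relation` and the encoding of SAME and OPPOSITE as +1 and -1 are assumptions introduced here.

```python
def infer_relation(premises, start, goal):
    """Return +1 if start and goal change in the same direction,
    -1 if they change in opposite directions, None if unlinked.

    premises: list of (var_a, var_b, relation), relation is "SAME" or "OPPOSITE".
    """
    sign = {"SAME": 1, "OPPOSITE": -1}
    # Build an undirected graph whose edges carry +1 (same) or -1 (opposite).
    graph = {}
    for a, b, relation in premises:
        graph.setdefault(a, []).append((b, sign[relation]))
        graph.setdefault(b, []).append((a, sign[relation]))
    # Depth-first search, multiplying the signs along the path.
    stack = [(start, 1)]
    seen = {start}
    while stack:
        node, accumulated = stack.pop()
        if node == goal:
            return accumulated
        for neighbour, edge_sign in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append((neighbour, accumulated * edge_sign))
    return None


# The premises of the example problem in Figure 24.2.
premises = [("F", "G", "SAME"), ("E", "F", "OPPOSITE"), ("G", "H", "OPPOSITE")]
relation = infer_relation(premises, "E", "H")

# E-F is opposite, F-G is same, G-H is opposite: the two reversals cancel,
# so E and H change in the same direction and the answer is "no".
print("H decreases when E increases:", relation == -1)
```

Every premise lies on the path from E to H, so all three must be held in mind and combined to answer the question.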

Because the tasks were each administered 
in two or more studies from my laboratory, 
the data have been combined across stud- 
ies. The research participants in the studies 
were all similar in that they ranged from 18 
to over 80 years of age, had an average of 
between 14 and 17 years of education, and 
generally reported themselves to be in good 
to excellent health. 

Figure 24.3. Means and standard errors of performance in four different reasoning tasks as a function 
of age. Data from various studies by Salthouse and colleagues. [Plot: performance in young adult SD 
units (vertical axis) against chronological age, 20 to 80 (horizontal axis). Series: Matrix Reasoning 
(N = 1976), Series Completion (N = 1283), Analytical Reasoning (N = 1160), and Integrative 
Reasoning (N = 985).] 



Age relations in these four tasks are por- 
trayed in Figure 24.3 in the same format used 
to display results of the tests from the psy- 
chometric test batteries. It can be seen that 
the pattern with these data closely resembles 
that from the normative samples in the stan- 
dardized test batteries. In particular, there 
is an approximately linear decline in perfor- 
mance with increased age, such that the av- 
erage at age seventy is about one standard 
deviation below the average of the reference 
group of young adults. 

The age relations for three of the vari- 
ables in Figure 24.3 were nearly identi- 
cal, but the age function was shallower for 
the series completion variable. This may be 
because several items in the Shipley Ab- 
straction series completion test (from which 
these data were derived) have considerable 
reliance on verbal knowledge, which tends 
to be relatively well preserved across this 
age range. For example, some of the items 
in that test involve determining relations 


among letters in reverse alphabetical se- 
quence, or among words with particular 
semantic relations. Additional support for 
this differential-involvement-of-knowledge 
interpretation of the different age trends is 
provided by the correlations of the reasoning 
variables with a composite vocabulary vari- 
able, as the correlations were 0.37 for ma- 
trix reasoning, 0.23 for analytical reasoning, 
0.24 for integrative reasoning, and 0.66 for 
series completion. 

Although not apparent in Figures 24.1 
and 24.3, other results indicate that the age 
relations on variables assessing reasoning are 
as large or, in some cases, even larger than 
the age relations on other types of cogni- 
tive variables. For example, Verhaeghen and 
Salthouse (1997) reported a meta-analysis 
in which the weighted correlation (based on 
9,342 individuals across thirty-eight studies) 
between age and measures of reason- 
ing was -0.40, and the weighted corre- 
lation (based on 5,871 individuals across 
[...] studies) between age and mea- 
sures of episodic memory was -0.33. Fur- 
thermore, in analyses to be described later, 
the correlations between age and factor 
scores were very similar for factors based 
on memory (r = -0.48) and reasoning (r = 
-0.49) variables. 

Despite their similar magnitude, age dif- 
ferences in reasoning are not as widely rec- 
ognized as age differences in memory. A 
possible reason may be that considerable 
knowledge is required in many everyday sit- 
uations that involve reasoning, such that any 
age effects might not be noticed either be- 
cause of a large positive relation between age 
and knowledge, or because any deficiencies 
are attributed to lack of relevant knowledge 
instead of to problems of reasoning. 

The primary question in light of age 
differences such as those apparent in Fig- 
ures 24.1 and 24.3 is, “What is responsible 
for the large negative relations between age 
and performance on measures of reasoning?” 
Much of the research that has been con- 
ducted to address this question can be clas- 
sified into one of two broad categories. One 
category consists of investigations of the in- 
fluence of factors such as comprehension, 
speed, strategy, and working memory on 
the age differences in the performance of a 
particular reasoning task. The second cate- 
gory of research has involved examining age- 
related effects on measures of reasoning in 
the context of age-related effects on other 
cognitive abilities. In the remaining sections 
of this chapter, the two approaches are illus- 
trated with research from my laboratory. 


Process-Oriented Research 


The majority of the empirical research in the 
area of cognitive aging has focused on a sin- 
gle cognitive variable (with different stud- 
ies concentrating on different variables), and 
has attempted to determine the relative con- 
tribution of different processes to the age dif- 
ferences on that particular variable. Among 
the potential determinants of age differences 
in reasoning variables that have been inves- 
tigated are comprehension, 
speed, strategy, and working memory. Em- 
pirical research relevant to each of these po- 
tential determinants is briefly summarized in 
this section. 


Comprehension 


It is conceivable that at least some of the 
age differences in reasoning are simply 
attributable to greater difficulties associated 
with increased age in understanding exactly 
what is required to perform the task suc- 
cessfully. This is an important possibility to 
consider because age differences in reasoning 
would probably not be of much theoretical 
interest if they merely reflected comprehen- 
sion problems. 

The primary means by which the com- 
prehension interpretation has been inves- 
tigated is to restrict comparisons to individu- 
als for whom there is evidence that they 
understood the task requirements. For ex- 
ample, participants with accuracy less than 
some criterion value have been excluded 
from the analyses in integrative reasoning 
(Salthouse, 1992b, 1992c) and matrix rea- 
soning (Salthouse & Skovronek, 1992) tasks, 
and analyses have been restricted to partici- 
pants with correct responses on the first two 
items in the matrix reasoning (Salthouse, 
1993) task. In each of these cases, strong neg- 
ative age relations were evident among the 
participants who understood the tasks well 
enough to answer several problems correctly. 
These results therefore suggest that age dif- 
ferences in simple comprehension proba- 
bly are not responsible for much, if any, 
of the age differences observed in measures 
of reasoning. 


Speed 


Another relatively uninteresting possibility 
is that age differences in measures of rea- 
soning might merely reflect a slower rate 
of reading or of responding, without any 
detrimental effects on the quality of perfor- 
mance. Because effects of age-related slow- 
ing have been extensively documented (e.g., 
Salthouse, 1996a), it is important to con- 
sider whether age differences in reasoning 
processes associated with encoding or re- 
sponding to the information. 

One way in which the role of slower rates 
of input or output has been investigated in- 
volves examining age relations on reason- 
ing tasks administered under untimed, or 
self-paced, conditions. Most of the com- 
parisons have revealed significant age dif- 
ferences even when the participants are 
allowed to control the duration of the stim- 
ulus presentation, and take as long as they 
want to respond. Age differences in deci- 
sion accuracy under these conditions have 
been found in geometric analogies (Salt- 
house, 1987), series completion (Salthouse 
& Prill, 1987), matrix reasoning (Salthouse, 
1993; 1994; Salthouse & Skovronek, 1992), 
and integrative reasoning (Salthouse, 1992c; 
Salthouse et al., 1989; 1990) tasks, and in 
the Wisconsin Card Sorting Test (WCST; 
Salthouse et al., 1996; Fristoe, Salthouse, & 
Woodard, 1997; Salthouse et al., 2003). 

The role of speed on age differences in 
matrix reasoning was examined more closely 
in two studies by Salthouse (1994) by ob- 
taining separate measures of study time, de- 
cision time, and decision accuracy from each 
participant. Not only were significant age 
differences found on each measure, but anal- 
yses revealed that some of the age-related ef- 
fects on the decision accuracy measure were 
statistically independent of the age-related 
effects on the study time and decision time 
measures. At least in this project, therefore, 
older adults took longer than younger adults 
to work on the problems and to communi- 
cate their decisions, and their decisions were 
less accurate. 

A second method of investigating the role 
of limited time on age differences in reason- 
ing involves examining age differences in the 
percentage of items answered correctly only 
for attempted items, as inferred by the pres- 
ence of an overt response. Strong negative 
age relations have been found even when 
only attempted items were considered in in- 
tegrative reasoning (Salthouse, 1992b), ge- 
ometric analogies (Salthouse, 1992b), and 
matrix reasoning (Salthouse, 1991; 1993; 


items in matrix reasoning and analytical rea- 
soning tests that were attempted by every- 
one (Salthouse, 2000, 2001). 

Taken in combination, the results just de- 
scribed suggest that adult age differences 
in reasoning are not simply attributable to 
slower rates of reading or responding. The 
speed of internal mental operations may be 
a factor in some of the performance differ- 
ences (see Salthouse, 1996a), but because 
sizable age differences in accuracy are found 
when there are no external time constraints, 
the differences do not appear to be solely the 
result of slower rates of input or output. 


Strategy 


One of the most popular interpretations of 
age differences in cognitive functioning, at 
least in part because it implies that the age 
differences might be amenable to interven- 
tion, attributes them to the use of different 
strategies by adults of different ages. It is im- 
portant to consider two issues when evalu- 
ating this distinction: whether or not adults 
of different ages actually do use different 
strategies when performing the task and, if 
so, what is responsible for those differences. 

Information about the existence of pos- 
sible strategy differences has been obtained 
by examining the distribution of study times 
across different parts of the reasoning prob- 
lem. For example, the research participant 
could be instructed to press a key to view 
each element of the problem, and then the 
time between successive keystrokes could be 
recorded to determine the time devoted to 
inspecting or studying each element. Vari- 
ants of this method have been used in a num- 
ber of reasoning tasks with comparable out- 
comes. Specifically, the relative distribution 
of inspection or study times has been found 
to be similar in young and old adults in se- 
ries completion (Salthouse & Prill, 1987), in- 
tegrative reasoning (Salthouse et al., 1990), 
and geometric analogies (Salthouse, 1987). 
To the extent that relative time allocation 
across different elements of a problem can 
be considered as evidence of a particular 
strategy, these results suggest that 
young and old adults were using a similar 
strategy. 

Additional evidence relevant to the strat- 
egy interpretation of age-related differences 
in reasoning is based on an examination of 
possible age differences in the pattern of in- 
correct alternatives selected when choosing 
a response. The rationale is that adults of dif- 
ferent ages might be expected to differ in 
the frequency of selecting particular incor- 
rect alternatives if they were relying on dif- 
ferent rules or strategies to select their an- 
swers. However, no age differences in the 
relative percentages of different types of er- 
rors in a matrix reasoning task were found 
by Salthouse (1993; also see Babcock, 2002), 
which suggests that adults of different ages 
were probably using the same strategies but 
that the effectiveness of the strategies was 
lower with increased age. 

Finally, a study by Fristoe, Salthouse, & 
Woodard (1997) was designed to investigate 
the manner in which young and old adults 
performed the WCST. The WCST is a con- 
cept identification test in which the stim- 
uli consist of cards that vary in the number, 
color, and shape of objects. An unusual fea- 
ture of the test is that the rule (i.e., num- 
ber, color, or shape) used to determine the 
correct sorting of the cards changes after ev- 
ery 10 correct sorts without informing the 
participant. The participants in the Fristoe, 
Salthouse, & Woodard (1997) study were 
asked to indicate the dimension that they 
were using in making their decisions about 
how to sort stimulus cards. By combining 
this information with the responses selected 
and the feedback received after each re- 
sponse, it was possible to determine the per- 
centage of times each participant maintained 
the same hypothesis after receiving positive 
feedback (i.e., “win-stay”), and the percent- 
age of times he or she changed hypothe- 
ses after receiving negative feedback (i.e., 
“lose-shift”). 
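
The two feedback-usage percentages can be illustrated with a brief Python sketch. The function name `feedback_usage`, the trial encoding, and the example trials are all hypothetical and are introduced here only to make the definitions concrete; the scoring procedure actually used in the study is not reproduced.

```python
def feedback_usage(trials):
    """Compute win-stay and lose-shift percentages.

    trials: list of (hypothesis, feedback_was_positive) in presentation order,
    where hypothesis is the sorting dimension reported on that trial.
    Returns (win_stay_pct, lose_shift_pct); an entry is None if undefined.
    """
    win_total = win_stay = 0
    lose_total = lose_shift = 0
    # Pair each trial's feedback with the hypothesis reported on the next trial.
    for (hypothesis, positive), (next_hypothesis, _) in zip(trials, trials[1:]):
        if positive:
            win_total += 1
            if next_hypothesis == hypothesis:
                win_stay += 1
        else:
            lose_total += 1
            if next_hypothesis != hypothesis:
                lose_shift += 1

    def percentage(part, whole):
        return 100.0 * part / whole if whole else None

    return percentage(win_stay, win_total), percentage(lose_shift, lose_total)


# Hypothetical sequence of reported hypotheses and feedback (not real data).
trials = [
    ("color", True), ("color", True), ("color", False),
    ("shape", False), ("shape", True), ("number", True),
]
win_stay_pct, lose_shift_pct = feedback_usage(trials)
print(win_stay_pct, lose_shift_pct)
```

In this made-up sequence the participant keeps a hypothesis after two of three positive outcomes and changes it after one of two negative outcomes; the final trial is not scored because no later hypothesis follows it.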

Optimal performance in this type of 
feedback-based concept identification situ- 
ation would be manifested in high percent- 
ages of “win-stay” and “lose-shift” behavior. 


Compared with young adults, older adults 


had lower percentages of both types of be- 
havior, and statistical control of a compos- 
ite measure of feedback usage reduced the 
age-related variance in a measure of WCST 
performance by 74%. These results clearly 
indicate that the young and old adults in this 
study performed the task in a somewhat dif- 
ferent fashion and that the difference was 
related to success in the task. However, be- 
cause there was no evidence that the older 
adults were as capable as the young adults of 
performing in the same optimal manner, it 
is questionable whether the differences ob- 
served in the way the task was performed 
should be considered evidence for differ- 
ences in strategy, which has a voluntary or 
optional connotation. 

Although only a limited amount of rele- 
vant evidence is currently available, it does 
not appear that much, if any, of the age- 
related differences in reasoning can be ex- 
plained by differences in the strategies used 
to perform the task. Furthermore, it is im- 
portant to recognize that, even if evidence 
of strategy differences were available, inter- 
pretations based on strategy differences are 
likely to be somewhat ambiguous unless an 
explanation is also provided for why peo- 
ple of different ages used different strate- 
gies. That is, if strategy differences were to 
be found, a critical question is whether the 
most effective or optimal strategy is still fea- 
sible for older adults but not used for some 
reason, or whether older adults are less able 
to use the more powerful or optimal strategy 
than young adults. As a result, a difference 
in strategy might be viewed merely as a dif- 
ferent level of description, such that if age 
differences were to be found, they would 
still need to be explained, just as would 
age differences in measures of overall task 
performance. 


Working Memory 


An interpretation that has generated consid- 
erable interest, particularly since a provoca- 
tive article by Kyllonen and Christal (1990) 
that reported a very strong relation between 
measures of working memory (WM) and 
measures of reasoning (see Morrison, Chap. 
19), is that at least some of the age- 
related differences in reasoning might be 
attributable to age differences in WM. 
Because WM has been defined as the abil- 
ity to preserve information while processing 
the same or other information, and because 
many reasoning tasks require that informa- 
tion be maintained in order for it to be oper- 
ated upon, interpreting the age differences 
in reasoning as a function of WM has con- 
siderable intuitive plausibility. 

One method used to investigate the 
role of WM in reasoning involves manip- 
ulation of the number of premises pre- 
sented in integrative reasoning problems. 
The rationale is that increasing the number of 
premises would increase the WM require- 
ments, which might then be expected to 
increase the magnitude of the age differ- 
ences in reasoning performance if at least 
some of those differences are attributable 
to WM limitations. Support for this expec- 
tation was provided in four independent 
studies (Salthouse, 1992b, 1992c; Salthouse 
et al., 1989; Salthouse et al., 1990). In each 
study, reasoning accuracy decreased as the 
number of premises increased, and the mag- 
nitude of this decrease was greater for older 
adults than for young adults. 

Another manipulation incorporated in 
several integrative reasoning studies in- 
volved the presentation of trials in which 
only one of the premises was relevant to the 
decision. Consider the problem portrayed in 
the lower right panel of Figure 24.2, for ex- 
ample. In the version displayed, all of the 
premises are relevant to the decision and 
would need to be considered to reach a 
valid conclusion. If, instead of referring to 
variables E and H, the question referred to 
variables E and F, however, all of the infor- 
mation relevant to the decision would have 
been presented in a single premise. These 
“one-relevant” trials are interesting because 
no across-premise integration of informa- 
tion is required for a correct decision, and 
the major determinant of quality of per- 
formance therefore is presumably the abil- 
ity to maintain the relevant information in 
memory. (In these partic- 
ular studies, the task was administered on a 
computer and only one premise was visible 
at a time.) 

A consistent finding in each of these stud- 
ies (i.e., Salthouse, 1992c; Salthouse et al., 
1989; Salthouse et al., 1990) was that the re- 
lation of accuracy to the number of premises 
was nearly identical when only one premise 
was relevant and when two or more premises 
were relevant. Furthermore, this pattern was 
similar across adults of all ages. These re- 
sults therefore suggest that the primary rea- 
son why accuracy was lower when the prob- 
lems contained more premises was related to 
the availability of information and not to dif- 
ficulties in integrating relevant information. 
The fact that the pattern was similar in adults 
of all ages further implies that the age dif- 
ferences in this task are largely attributable 
to differences in the availability of relevant 
information. 

An additional expectation from the 
information-availability interpretation is 
that age differences should be evident in 
the shape of the serial position functions 
relating decision accuracy to sequential 
position of the relevant premise. In fact, 
Salthouse et al. (1990) did find that young 
adults exhibited a classical serial position 
function, with higher accuracy for the more 
recent premises, whereas the function for 
older adults was flat. However, for reasons 
that are not yet clear, this pattern was not 
replicated in a later study by Salthouse 
(1992c). 

Manipulation of the number of problem 
elements has also been examined in geomet- 
ric analogy and matrix reasoning tasks, with 
somewhat different patterns of results. To 
illustrate, three studies found that age dif- 
ferences in measures of decision time, de- 
cision accuracy, or both, were larger when 
there were more relevant elements in geo- 
metric analogy problems (Salthouse, 1987, 
1988, 1992c). In several studies reported by 
Salthouse (1993) and in a study by Salthouse 
(1994), however, age differences in a matrix 
reasoning task were nearly constant across 
increases in the number of relations among 
elements, and in none of these studies was 
there a significant interaction between age 
and number of relations in the problem. Spe- 
cific characteristics of the tasks may be re- 
sponsible for the different patterns of re- 
sults across integrative reasoning, geometric 
analogy, and matrix reasoning tasks, but the 
exact nature of those characteristics is not 
yet known. 

Another method used to investigate the 
role of WM in reasoning involves assess- 
ing on-line availability of information dur 
ing the performance of the task. For ex- 
ample, Salthouse (1993) and Salthouse and 
Skovronek (1992) presented a successive 
version of the matrix reasoning task in which 
each matrix cell was numbered. To view a 
cell in the matrix, the participant had to type 
the corresponding number. In three sepa- 
rate studies, older adults were found to ex- 
amine the same cell more frequently than 
young adults, as though the information in- 
spected earlier was no longer functionally 
available to them. Furthermore, when pre- 
sented with probes of information examined 
earlier, older adults were less accurate than 
young adults in recognizing the contents of 
previously viewed cells (Salthouse, 1993). 

A final piece of evidence relevant to the 
WM interpretation of age differences in rea- 
soning is that Salthouse (1992c) found a 
qualitatively similar pattern of differences 
between young and old adults, and between 
young adults with and without a concur- 
rent memory load (of five random digits). 
To the extent that a concurrent mem- 
ory load is viewed as simulating reduced 
WM capacity, this finding is consistent with 
the hypothesis that at least some of the 
age differences in the integrative reason- 
ing task are attributable to age differences 
in WM. 

In summary, results from a number of dif- 
ferent types of comparisons in a variety of 
reasoning tasks lend credibility to the inter- 
pretation that the ability to maintain rele- 
vant information during the performance of 
reasoning tasks likely contributes to at least 
some of the adult age differences in reason- 
ing. Although the available evidence sug- 
gests that working memory is probably in- 
volved in the age differences in reasoning, 


the role of other factors in the age differ- 
ences, remain to be determined. 


Correlational Analyses 


The second major approach to investigat- 
ing adult age differences in cognition has re- 
lied upon correlational data to attempt to 
specify the number and nature of statisti- 
cally distinct age-related influences operat- 
ing on different types of cognitive variables. 
In this section, results relevant to under- 
standing effects of aging on reasoning based 
on mediational, componential, correlated- 
factors, and hierarchical structure models are 


described briefly. 


Mediational Models 


The goal of mediational models is to exam- 
ine the role of one or more constructs as 
potential mediators of the age differences 
in measures of reasoning by means of sta- 
tistical adjustment. The rationale is that if 
age-related effects on variable Y are at least 
partially attributable to age-related effects 
on variable X, then statistical control of X 
should reduce the magnitude of the age- 
related effects on Y. For the purpose of these 
analyses, X could be a measure of any factor 
hypothesized to be important in the target 
variable, Y. Most of the mediational models 
applied to reasoning have used measures of 
WM in the role of X because of the assump- 
tion that reasoning tasks frequently require 
that earlier information be preserved when 
processing later information, and individuals 
who are better able to do that, as reflected by 
higher scores on WM tasks, therefore would 
be expected to perform at higher levels on 
reasoning tasks. 

Several studies in my laboratory have re- 
lied upon two tasks to assess WM. Both 
require participants to remember informa- 
tion while simultaneously processing other 
information. In the computation span task, 
for example, arithmetic problems had to be 
answered while remembering the last digit 
in each problem, and in the listening span 
task, questions about sentences had to be 
answered while remembering the last word 
in each sentence. Measures of performance 
in these tasks have been found to exhibit 
good reliability, and to be negatively corre- 
lated with age. 

Three sets of results are necessary to 
establish the plausibility of a mediational 
interpretation of age-related differences in 
reasoning. The first is the demonstration of 
age-related differences in the expected di- 
rection in measures of the hypothesized me- 
diator, because a construct cannot mediate 
age differences in other variables or con- 
structs if it is not related to age. The second 
necessary result is the existence of a moder- 
ate relation between the hypothesized medi- 
ator and the target variable it is presumed to 
explain, because no mediation is possible if 
the suspected mediator and the target vari- 
able are not related to one another. Third, 
age-related differences in the target variable 
should be reduced after statistical control of 
the mediator, with the magnitude of the re- 
duction serving as an approximate index of 
the degree of mediation. This last result is 
critical because mediation is not plausible 
if the relations of age to the target variable 
are not at least moderately attenuated when 
the variability in the hypothesized mediator 
is eliminated. 

A variety of procedures can be used to 
statistically control the hypothesized medi- 
ator, such as partial correlation, semipartial 
correlation (available from hierarchical re- 
gression), analysis of covariance, and so on. 
In each case, the goal is to eliminate the 
variance in the target variable that is re- 
lated to the mediator such that relations be- 
tween age and the target variable can be 
examined when differences in the level of 
the mediator no longer influence the target 
variable. 

The most relevant comparisons from me- 
diational analyses of WM on reasoning are 
those between the initial age relation on the 
reasoning variable and the age-reasoning re-
lation after statistical control of the WM
measure. A consistent finding across several 
different types of reasoning tasks has been a 
substantial reduction in the age-related vari-
ance, with
reductions of 57% (Salthouse et al., 1989),
88% (Salthouse, 1992b; also see Salthouse, 
1991), and 48% (Salthouse, 1992c) in inte- 
grative reasoning tasks, 65% in a geomet- 
ric analogies task (Salthouse, 1992b), and 
43% to 84% in matrix reasoning tasks (Salt- 
house, 1993). Similar findings have been 
reported by other researchers with matrix 
reasoning (Babcock, 1994) and syllogistic 
reasoning (Fisk & Sharp, 2002; Gilinsky & 
Judd, 1994) tasks. Sizable reductions in the 
age-related differences after control of WM 
have been found even with percentage cor- 
rect measures (Salthouse, 1992b) and on the 
accuracy of individual items in a matrix rea-
soning task (Salthouse, 1993). A significant
relation of WM to two-premise and three-
premise integrative reasoning problems also
has been found after control of the influence 
of one-premise problems (Salthouse, 1992b, 
1996b), which implies that WM specifically 
contributes to the maintenance of informa- 
tion needed in more complex problems. 
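
The arithmetic behind percentage reductions of this kind can be sketched as follows. This is a minimal illustration in Python, not the procedure used in the cited studies: it assumes that only the three zero-order correlations among age, the mediator X, and the target variable Y are available, and the function name and input values are hypothetical.

```python
import math

def age_variance_reduction(r_age_y, r_age_x, r_x_y):
    """Proportional reduction in age-related variance in Y after control of X.

    r_age_y: correlation between age and the target variable Y
    r_age_x: correlation between age and the hypothesized mediator X
    r_x_y:   correlation between the mediator X and the target Y
    """
    # Semipartial correlation of Y with age after X is partialled from age.
    # Its square is the increment in R^2 when age is entered after X in a
    # hierarchical regression, i.e. the unique age-related variance in Y.
    semipartial = (r_age_y - r_x_y * r_age_x) / math.sqrt(1.0 - r_age_x ** 2)
    unique_age_variance = semipartial ** 2
    total_age_variance = r_age_y ** 2
    return 1.0 - unique_age_variance / total_age_variance

# Hypothetical correlations chosen only for illustration.
reduction = age_variance_reduction(r_age_y=-0.50, r_age_x=-0.50, r_x_y=0.60)
print(f"{reduction:.0%} of the age-related variance in Y is shared with X")
```

With these made-up inputs, roughly 79% of the age-related variance in Y is shared with X, which is the sense in which the reductions listed above serve as an approximate index of mediation.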

This pattern of results clearly is consistent 
with the hypothesized influence of WM on 
age-related differences in reasoning. How- 
ever, it is important to recognize that com- 
parable, and sometimes even larger, reduc- 
tions in the age-related effects in reasoning 
have been found after statistical control of 
other theoretical constructs, such as percep-
tual speed (e.g., Salthouse, 1991, 1993, 1994,
1996a). Because most cognitive variables are 
positively correlated with one another, some 
attenuation of the age-related effects on one 
cognitive variable likely would be expected 
after statistical control of almost any other 
cognitive variable. A discovery of attenuated 
age-related variance after statistical control 
of a hypothesized mediator therefore should 
be considered only necessary, but not suf- 
ficient, evidence for the validity of media- 
tional hypotheses. 


Componential Models 


Componential models are more complex 
than mediational models because they pos-
tulate that nearly every cognitive task in-
volves multiple processes or components,
and that performance on the task is in-
fluenced by the efficiency or effectiveness
of each component. Componential models 
have been investigated by relying upon the 
pattern of correlations among measures of 
the components and a measure of perfor-
mance on the target reasoning task to de-
termine the relative contribution of each
hypothesized component. For example, a re- 
searcher might postulate that components 
A, B, and C are required to perform a partic- 
ular task, administer tasks to obtain variables 
that reflect A, B, and C as directly as possible, 
and then examine correlations among the 
variables based on the reasoning tasks and 
the component tasks. Componential models 
can be applied to research on aging by deter- 
mining the degree to which age-related ef- 
fects on the target reasoning task are altered 
when variability in measures of the compo-
nents is statistically controlled.

Componential models of the matrix rea-
soning and analytical reasoning tasks were
investigated by Salthouse (2001), and a 
somewhat different componential analysis 
of age differences in matrix reasoning was 
reported by Babcock (1994). Salthouse hy- 
pothesized three components were involved 
in each of the tasks: rule identification, 
rule application, and information integra- 
tion in the matrix reasoning task; and sim- 
ple comprehension, information integration, 
and condition verification in the analytical 
reasoning task. Primarily on the basis of in- 
tuition and judgments of face validity, two 
variables were selected to represent each 
hypothesized component. To illustrate, the 
rule identification component was assessed 
by a Figure Classification test, in which ex- 
aminees determine the basis by which differ- 
ent figures are related to one another, and by 
a Location test, in which examinees deter- 
mine the rule governing the position of a set 
of Xs in each row of a matrix. The rule ap- 
plication component was assessed with two 
tasks (i.e., Pattern Transformation and Geo- 
metric Transformation) in which the exami- 
nee views an initial line pattern or geometric 
figure, carries out a specified transformation 
(such as rotation, subtraction, or addition), 
and then decides whether the transforma-
tion applied to the initial figure would match
a comparison figure.

A critical prerequisite for a componen- 
tial analysis is that the pattern of correlations 
and, specifically, the results from a confirma- 
tory factor analysis, should provide evidence 
for distinct constructs. That is, only if there 
is evidence that the variables represent sep- 
arate constructs is it meaningful to examine 
their relative contributions to the age dif- 
ferences in the performance of the criterion 
reasoning task. The results of the two stud- 
ies reported by Salthouse (2001) were not 
consistent with the existence of three sepa- 
rate factors because all of the variables had 
similar correlations with one another. To il-
lustrate, the correlation between the two
rule identification variables was 0.50, and
their correlations with variables hypothe- 
sized to reflect the rule application compo- 
nent ranged from 0.48 to 0.62. Because there 
was no evidence that the hypothesized com- 
ponents represented distinct dimensions of 
individual differences (i.e., exhibited con- 
struct validity), it was impossible in these 
studies to decompose the age differences in 
the target tasks into discrete components. 
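
The comparison just described can be sketched in a few lines of Python. This is only a crude heuristic, not the confirmatory factor analysis reported in the study: it assumes that indicators of a distinct construct should correlate more strongly with one another than with indicators of a different construct. The function name is hypothetical, and only the endpoints of the reported cross-construct range are used.

```python
def looks_distinct(within_r, cross_rs):
    """Crude check for discriminant validity: indicators of one construct
    should correlate more strongly with each other than with indicators
    of another construct."""
    return within_r > max(cross_rs)

# Values quoted in the text.
within_rule_identification = 0.50            # between the two rule identification variables
cross_with_rule_application = (0.48, 0.62)   # endpoints of the reported range

distinct = looks_distinct(within_rule_identification, cross_with_rule_application)
print(distinct)
```

Because the within-construct correlation does not exceed the cross-construct correlations, the check fails, in line with the conclusion that the components could not be separated.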

There are at least three possible inter- 
pretations of results such as those just de- 
scribed. First, the theoretical models may 
not have been valid because the designated 
components are not actually required to per-
form the tasks. Second, the models could
have been valid and the components might 
have been relevant to performance on the 
target task, but the components were not 
accurately assessed with the selected tasks. 
And third, the models may not have been 
valid because the hypothesized components 
do not actually exist as distinct entities. Un- 
fortunately, the available data do not allow 
these alternatives to be distinguished. How- 
ever, it is worth considering whether a simi- 
lar situation may exist in componential mod- 
els of other cognitive tasks but has not been 
recognized because there have seldom been 
any attempts to investigate the construct va- 
lidity of the hypothesized processes or com- 
ponents. Results of the Salthouse (2001) 
project therefore suggest that it is important 
to obtain empirical evidence of the construct
validity of hypothesized components before
investigating their role in cognitive tasks.


Correlated Factor Models 


The variables included in mediational and 
componential models typically have been se- 
lected because of their presumed relevance 
to the target variable one is trying to ex- 
plain. An alternative approach based on cor- 
relational data would be to consider the 
interrelations among a broad variety of cog- 
nitive variables in terms of some organiza- 
tional structure and then examine relations 
of age to the target variable within the con- 
text of that structure. 

The simplest organizational structure is 
one in which the variables are grouped into 
several first-order factors or abilities, with 
the factors allowed to correlate with one an- 
other. Age-related effects on specific reason- 
ing variables can be investigated in this type 
of correlated-factors structure by determin- 
ing the degree to which the age-related ef- 
fects on the target reasoning variable are di- 
rect or are indirect and operate through one 
or more cognitive abilities. 

The ideal data set for analyses involving 
cognitive abilities would involve a wide va- 
riety of cognitive variables, and as large and 
diverse a sample of participants as possible. 
No single study is likely to possess all of 
these characteristics, but an approximation 
to this ideal can be obtained by aggregating 
data across different studies involving differ- 
ent combinations of variables. Aggregation 
of the data in this way essentially treats the 
individuals as though they were participants 
in a single large study but with missing val- 
ues for the variables that were not collected 
in the particular study in which an individ- 
ual participated. Although data with a large 
proportion of missing values can be compli- 
cated to analyze, meaningful analyses can be 
conducted by relying on an algorithm such 
as the full information maximum likelihood 
procedure (e.g., Enders & Bandalos, 2001) to 
take advantage of all available information. 
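
The data layout implied by this kind of aggregation can be sketched as follows. The example is a hypothetical illustration in Python: the study names, variable names, and scores are invented, and the code only builds the combined table with missing values. It does not implement full information maximum likelihood estimation, which would be carried out in dedicated structural equation modeling software.

```python
def aggregate(studies):
    """Stack participant records from several studies into one table,
    marking variables that a study did not collect as missing (None)."""
    all_vars = sorted({var for records in studies.values()
                       for rec in records for var in rec})
    table = []
    for study, records in studies.items():
        for rec in records:
            row = {"study": study}
            row.update({var: rec.get(var) for var in all_vars})
            table.append(row)
    return all_vars, table

# Hypothetical studies with different combinations of variables.
studies = {
    "study_A": [{"age": 25, "matrix": 14, "vocab": 30},
                {"age": 70, "matrix": 8, "vocab": 36}],
    "study_B": [{"age": 40, "matrix": 11, "speed": 52}],
}

variables, table = aggregate(studies)
# Number of participants with a missing value on each variable.
missing = {v: sum(row[v] is None for row in table) for v in variables}
print(missing)
```

Each participant contributes whatever variables were collected in his or her study, so the sample size differs from variable to variable, as it does in Table 24.1.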

A combined data set of this type was cre- 
ated by aggregating data across 33 separate 
studies from my laboratory involving a to- 


included in the aggregate data set are listed 
in Table 24.1 together with the respective 
sample sizes and age correlations. Entries 
in the right-most columns in Table 24.1 are 
the factor loadings from a confirmatory fac- 
tor analysis in which factors corresponding 
to reasoning, spatial visualization, episodic 
memory, perceptual speed, and vocabulary 
abilities were postulated. As expected, the 
loadings of the variables on the factors all 
were high, with only four below 0.7, and
the factors were moderately correlated with 
one another. A second model examined re- 
lations between age and each of the ability 
factors. These (standardized) relations were 
-0.49 for reasoning, -0.41 for space, -0.48
for episodic memory, -0.63 for speed, and
0.25 for vocabulary.

Inspection of the coefficients in the rea- 
soning column reveals that the matrix rea- 
soning and analytical reasoning variables 
both had high loadings on the reasoning fac- 
tor and therefore can be considered proto- 
typical reasoning tasks. The contributions of 
the five abilities to these two variables there- 
fore were examined by modifying the anal- 
ysis to specify relations of each of the five 
abilities to these variables. In effect, these 
analyses are asking what abilities contribute, 
and by how much, to the individual differ- 
ences in performance of these tests. The top 
panel of Table 24.2 summarizes results of 
these analyses, where it can be seen that, 
as expected, the strongest relation of each 
variable was with the reasoning factor. How- 
ever, it is important to note that each vari- 
able also had significant relations with fac- 
tors representing other cognitive abilities. 
Both the matrix reasoning and the analytical 
reasoning variables were positively related 
to spatial visualization ability and negatively 
related to vocabulary ability. This latter re- 
lation is rather puzzling because it suggests 
that, when other relations are taken into con- 
sideration, people with higher levels of vo- 
cabulary tend to perform somewhat worse 
on these reasoning tasks than people with 
lower levels of vocabulary. 

This simple structure can be used to esti-
mate the indirect effects of age on reasoning

Table 24.1. Data Aggregated across Multiple Studies

                                                  Factor Loading
Variable                          N  Age r   Rea   Spc   Mem   Spd   Voc

Matrix reasoning               1976   -.50   .87
Analytical reasoning           1160   -.46   .76
Shipley abstraction            1283   -.29   .87
Integrative reasoning           985   -.35   .62
Figure classification           458   -.60   .74
Cattell matrices                420   -.48   .82
Letter sets                    1179   -.26   .80
Geometric analogies             756   -.36   .78
PMA reasoning                   305   -.41   .86
Grammatical reasoning           229   -.35   .80
Series completion               150   -.37   .80
Analysis synthesis              204   -.36   .79
Power letter series             150   -.47   .93
WCST number of categories       711   -.28   .56
Diagramming relations           449   -.40   .76
Locations                       449   -.41   .60
Spatial relations              1154   -.34          Ql
Paper folding                   994   -.43         .81
Form boards                     847   -.38         .80
Surface development             639   -.32         .72
PMA space                       305   -.39         .76
Block design                    463   -.39         .89
Object assembly                 259   -.41         .81
Cube assembly                  1272   -.17         .60
Paired associates              1769   -.38               .72
Free recall                    1764   -.42               .84
Logical memory                  793   -.24               .72
Free recall of transfer list   1054   -.35               .77
Digit symbol                   2041   -.57                     .78
Letter comparison              6082   -.43                     .79
Pattern comparison             6082   -.52                     .82
Cross out                       204   -.71                     .92
Digit symbol reaction time     2417   -.56                     .77
WAIS vocabulary                 795    .13                           .86
WJ picture vocabulary           795    130                           .80
Antonym vocabulary             3509    .18                           .90
Synonym vocabulary             3511    ey)                           .89
Shipley vocabulary              259    122                           .93

Factor correlations
Reasoning (Rea)                                -   .88   .73   .79   .47
Space (Spc)                                          -   .65   .67   .46
Memory (Mem)                                               -   .70   .42
Speed (Spd)                                                      -   .28
Vocabulary (Voc)                                                       -

Notes: N = number, Age r = age correlation, Rea = reasoning, Spc = space, Mem = memory, Spd = speed, Voc = vocabulary,
PMA = Primary Mental Abilities, WCST = Wisconsin Card Sorting Test, WAIS = Wechsler Adult Intelligence
Scale, WJ = Woodcock-Johnson.

Table 24.2. Coefficients for Reasoning Variables on Five Cognitive Abilities

                           Rea    Spc    Mem    Spd    Voc

All
  Matrix reasoning         .86*   .25*  -.06   -.07   -.20*
  Analytical reasoning     .76*   .25*  -.04   -.13   -.17*
Matrix reasoning
  Under age 50             .97*   .18   -.11   -.08   -.20*
  Age 50 and over          .79*   .30*  -.02   -.02   -.21*
Analytical reasoning
  Under age 50             .91*   .01   -.04   -.04   -.10
  Age 50 and over          .50*   .46*  -.04   -.09   -.17

*p < .01

Note: None of the coefficients for the under-age-fifty group and the age-fifty-and-over group was significantly
different from one another.
Rea = reasoning, Spc = space, Mem = memory, Spd = speed, Voc = vocabulary.


variables by incorporating information about 
the relations between age and each ability. 
To illustrate, because the standardized coef-
ficient for the relation from age to the rea-
soning ability factor was -0.49, and that
for the relation between the reasoning fac-
tor and the matrix reasoning variable was
0.87, it can be inferred that -0.43 (i.e.,
-0.49 × 0.87) of the total -0.50 age effect
on matrix reasoning (cf. Table 24.1) is asso- 
ciated with influences through the reasoning 
ability factor. 
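
The calculation in this illustration can be reproduced directly. The short Python sketch below uses only the coefficients quoted in the text; the variable names are hypothetical, and the "remainder" is a simple subtraction that ignores the smaller paths through the other ability factors.

```python
# Coefficients quoted in the text.
age_to_reasoning = -0.49     # standardized relation from age to the reasoning factor
reasoning_to_matrix = 0.87   # relation from the reasoning factor to matrix reasoning
total_age_effect = -0.50     # age correlation of matrix reasoning (Table 24.1)

# Indirect effect of age through the reasoning factor: product of the two paths.
indirect = age_to_reasoning * reasoning_to_matrix

# Portion of the total age effect not carried by this single path.
remainder = total_age_effect - indirect

print(round(indirect, 2), round(remainder, 2))
```

Most of the total age relation on matrix reasoning is therefore accounted for by the path through the reasoning ability factor.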

The correlated-factors structure can also 
be used to investigate whether the variables 
represent the same constructs to the same 
degree at different ages (i.e., the issue of
measurement equivalence). The preceding 
analyses therefore were repeated in samples 
of adults under and over the age of 50, with 
the results summarized in the bottom panels 
of Table 24.2. Inspection of the entries indi- 
cates that the pattern of ability relations for 
the matrix reasoning variable was very simi- 
lar in the two age groups, consisting of a large 
positive relation with the reasoning factor, a 
small positive relation with the spatial visu- 
alization factor, and a small negative relation 
with the vocabulary factor. Although the 
pattern appears somewhat different across 
the two age groups for the analytical rea- 
soning variable, a direct test in which the 
parameters were constrained to be equal in 
the two samples to determine if there was 


a significant loss of fit to the data indicated 
that the group differences were not statis- 
tically significant. It therefore appears from 
these results that the two reasoning variables 
represent nearly the same combination of 
abilities at different ages. These particular 
results should be replicated before reaching 
any strong conclusions, but they serve to il- 
lustrate how correlational results can be in- 
formative about the possibility of qualitative 
differences in performance at different ages. 


Hierarchical Structure Models 


The correlated-factors model can be con- 
sidered relatively simple because, although 
the factors are allowed to correlate with one 
another, there is no attempt to explain the 
basis for those correlations in the context 
of the model. A somewhat more compli- 
cated model involves a hierarchical structure 
in which one or more higher-order factors 
are postulated to be responsible for the rela- 
tions among the first-order factors (Carroll, 
1993). An advantage of hierarchical models 
for the investigation of age-related effects is 
that they allow broad (on the higher-order 
common factor) and narrow (on the first- 
order ability factors) age-related influences 
to be examined simultaneously. 

A hierarchical analysis was conducted 
on the combined data summarized in 
Table 24.1 by examining the relations of age

Predicted:  .25  -.43  -.49  -.49  -.63
Observed:   .25  -.41  -.49  -.48  -.63

Figure 24.4. Hierarchical structural model of age relations on different cognitive
abilities based on the data summarized in Table 24.1. Numbers adjacent to the
arrows are standardized regression coefficients, and numbers in the bottom two
rows are correlations between age and the latent construct directly above.


to a second-order factor representing vari- 
ance common to the first-order factors and 
to each first-order factor, and then deleting 
all relations from age that were not signifi- 
cantly different from zero. Because the ag- 
gregation of data from samples with differ- 
ent combinations of variables results in a 
very high proportion of missing values for 
most variables, conventional measures of fit 
are not readily available in analyses with this 
type of data. However, the observed age- 
factor correlations can be compared with 
those predicted from the parameters of the 
model, and inspection of the entries at the 
bottom of Figure 24.4 indicates that the pre- 
dicted age correlations were very close to the 
observed age correlations, implying that the 
model is plausible. 

The coefficients provided from the hi- 
erarchical structure analysis on these data 
are portrayed in Figure 24.4, where it can 
be seen that four statistically independent 
age-related influences were identified. There 
was a large negative influence of age on the 


highest-order factor, a moderate positive in- 
fluence on the vocabulary factor, and small 
to moderate negative influences on factors 
corresponding to speed and memory abil- 
ities. A very similar pattern recently was 
found by Salthouse and Ferrer-Caja (2003) 
in analyses of three separate data sets, so 
these results apparently are robust. 

The hierarchical structure represented 
in Figure 24.4, together with the factor 
loadings presented in Table 24.1, can be 
used to estimate age-related influences on 
individual variables. Because the product 
of the standardized path coefficients pro- 
vides an estimate of the expected correla- 
tion between the variables, the product of 
the age-common, common-reasoning, and 
reasoning-variable coefficients can be com-
pared with the observed age-variable cor-
relation to determine how accurately the
model and its estimated parameters repro- 
duce the actual relations in the data. The 
predicted age correlation for the matrix rea-
soning variable was -0.42, the observed
correlation was -0.50, and the corresponding
predicted and observed values for the an-
alytical reasoning variable were -0.37 and
-0.46, respectively. With these particular
variables, therefore, the age relations are un- 
derestimated by the model, which implies 
that additional paths, such as a direct neg- 
ative relation from age to the variable, may 
be necessary to provide more accurate esti- 
mates of the true covariations in the data. 
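
The path-tracing logic described above can be sketched as follows. The common-to-reasoning coefficient (0.97), the factor loadings, and the observed age correlations come from the text and Table 24.1. The age-to-common coefficient is not quoted in the text, so the value of -0.50 used here is an assumption made only to keep the example concrete, and the function and variable names are hypothetical.

```python
from math import prod

def predicted_correlation(path_coefficients):
    """Product of standardized path coefficients along a single chain."""
    return prod(path_coefficients)

# Values taken from the text and Table 24.1.
COMMON_TO_REASONING = 0.97
LOADINGS = {"matrix reasoning": 0.87, "analytical reasoning": 0.76}
OBSERVED = {"matrix reasoning": -0.50, "analytical reasoning": -0.46}

# ASSUMPTION: the age -> common factor coefficient is not given in the text;
# -0.50 is a hypothetical value used only for illustration.
AGE_TO_COMMON = -0.50

results = {}
for variable, loading in LOADINGS.items():
    predicted = predicted_correlation([AGE_TO_COMMON, COMMON_TO_REASONING, loading])
    shortfall = OBSERVED[variable] - predicted  # negative: model underestimates the age relation
    results[variable] = (predicted, shortfall)
    print(f"{variable}: predicted {predicted:.2f}, "
          f"observed {OBSERVED[variable]:.2f}, shortfall {shortfall:.2f}")
```

Under the assumed age coefficient, the products round to the predicted values quoted in the text, and the negative shortfalls show the sense in which the model underestimates the observed age relations for these two variables.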

One of the most interesting results in Fig- 
ure 24.4, which was also apparent in the 
analyses reported by Salthouse and Ferrer- 
Caja (2003), is that the reasoning factor was 
the first-order factor with the strongest re- 
lation to what is common to all variables. 
In fact, the standardized coefficient of 0.97 
in Figure 24.4 indicates that there was al- 
most complete overlap of the individual dif- 
ferences in the reasoning factor with the in- 
dividual differences in what is common to 
several different cognitive abilities. This find- 
ing is intriguing because it suggests that an 
explanation of the age differences in reason- 
ing likely also will explain much of the age- 
related influences on other cognitive abili- 
ties, and vice versa. 


Conclusions and Future Directions 


Large age differences have been found in 
many measures of reasoning, and in some 
cases the differences are as large as those 
found in measures of other cognitive abilities 
such as memory. There still is no convinc- 
ing explanation of the causes of age-related 
effects on reasoning, although the available 
evidence suggests that aspects of WM likely 
contribute to at least some of these effects. 
Results of correlational analyses suggest that 
reasoning variables are central to what is 
common across a wide variety of cognitive 
abilities and to the age differences in differ- 
ent cognitive abilities. It therefore seems rea- 
sonable to expect that an understanding of 
age-related effects on reasoning may help ex- 
plain much of the age-related differences in 
a broad variety of cognitive variables. Finally, 
because of the centrality of reasoning to the 
individual differences in much of cognitive
functioning, research on aging and reasoning may ben-
efit from a broader, more multivariate, per-
spective than that typically employed in con-
temporary research and by considering the 
effects of aging on what is common to many 
different types of cognitive variables instead 
of focusing exclusively on the determinants 
of age-related differences in one particular 
task.


CHAPTER 25


Reasoning and Thinking in 
Nonhuman Primates 


Josep Call 
Michael Tomasello 


Fifty years ago, a chapter with the title 
“Reasoning and Thinking in Nonhuman Pri- 
mates” would have been a very short chapter. 
Behaviorists, of course, did not believe in rea- 
soning and thinking, and people who studied 
animals in their natural habitats (eventu- 
ally known as ethologists) were interested 
in other things. In the 1960s, the cognitive 
revolution transformed the way psycholo- 
gists studied human behavior and cognition, 
but much of this research was about hu- 
man symbolic, propositional representations 
(“the language of thought”) and was not eas- 
ily applied to research with nonhuman an- 
imals. The cognitive revolution thus came 
to the study of animal behavior only very 
slowly. But during the past two decades, 
it has arrived, and in the modern study of 
animal behavior, questions of cognition are 
among the most prominent. 

Scientists who study animals typically 
have a background in biology, so every- 
thing flows from the theory of evolution. 
These behavioral biologists and psychobiol- 
ogists are interested in how animals adapt to 
their environments — both physically and be- 
haviorally. In this context, some behavioral 


adaptations may be considered cognitive in 
the sense that they involve the individual or- 
ganism’s learning and reasoning and thinking 
on the basis of its own individual experience 
before deciding on the best way to act in 
a given circumstance. There are specifiable 
ecological circumstances in which evolution 
favors the greater flexibility afforded by cog- 
nitive adaptations, as opposed to, for exam- 
ple, hardwiring specific behavioral responses 
to specific environmental stimuli (Boyd & 
Richerson, 1985). 

In the case of nonhuman primates in par- 
ticular, there were actually two pioneers in 
cognitive research in the early part of the 
twentieth century. In Germany, Wolfgang 
Köhler was a Gestalt psychologist inter-
ested in intelligence as something that took
organisms beyond punctate sensations and 
blind trial-and-error learning. He studied a 
small group of chimpanzees in a variety 
of problem-solving situations, looking for 
cases of perceptual restructuring and insight 
(Köhler, 1925). In America, Robert Yerkes
studied a variety of behavioral phenomena 
in a number of primate species. His work in-
cluded studies in which animals had to solve
complex cognitive problems. In the mid-
dle part of the century, behaviorists studied
such things as the speed with which differ- 
ent species could be taught through reward- 
based training to make discriminations, form 
learning sets, and the like (Harlow, 1959; 
Rumbaugh, 1970) — phenomena which, to- 
day, could be given interesting cognitive in- 
terpretations. 

The most exciting work in the modern 
context comes under the two titles “com- 
parative cognition” and “cognitive ethology,” 
however. The former often refers to exper- 
imental work in the laboratory, and the lat- 
ter often refers to observational work in the 
natural environment. Ideally, for any given 
phenomenon, the two approaches provide 
complementary types of information. 

Our aim in this chapter is to provide an 
up-to-date overview of research on think- 
ing and reasoning in nonhuman primates 
(henceforth, simply “primates”). Thinking 
and reasoning, in our view, are character-
ized by mental transformations or leaps, not
just direct perception or memory of partic- 
ular stimuli; going “beyond the information 
given.” We therefore focus on primates solv- 
ing novel problems — that is, those that re- 
quire them to do more than simply learn 
and remember. In terms of content, we focus 
on topics that constitute aspects of human 
cognition represented by other chapters in 
the current volume, focusing in each case on 
both selected classic studies and the latest re- 
search findings. Our main topics are spatial 
(Tversky, Chap. 10), relational, analogical 
(Holyoak, Chap. 6), inferential, quantita- 
tive (Gallistel & Gelman, Chap. 23), causal 
(Buehner & Cheng, Chap. 7), and social rea- 
soning and thinking. Although the chapter is 
mainly about primates, readers interested in 
fuller accounts of animal cognition in general 
are referred to books published in the past 
few years (Pepperberg, 1999; Roberts, 1998; 
Shettleworth, 1998; Tomasello & Call, 1997; 
Vauclair, 1996). 


Spatial Reasoning 


The spatial behavior and cognition of pri-
mates and other animals is a very large field
of research. Here we explore two aspects:
(1) how individuals navigate in large-scale lo- 
comotor space while traveling and (2) how 
individuals search for objects more locally 
in small-scale manipulatory space. In both 
cases, the key skills involved in thinking and 
reasoning enable an individual to predict 
things — namely, the best path for its own 
locomotion or the likely future position of 
moving objects. 


Travel Strategies 


DETOURS 


The use of detours was one of the main 
issues investigated by Köhler (1925) with
chimpanzees. He found that they were ca- 
pable of taking alternative routes to a goal 
when the direct route was blocked. Since 
then, little additional research has been done 
except in other animal species such as
chickens or dogs. Recently, however, several 
researchers used computerized systems to 
present mazes. Here, the subject does not 
move, but it moves a cursor through the 
maze to get to a goal box. This is a good 
tool to investigate detours because mazes 
often involve the use of detours in which 
subjects have to move away from the direct 
approach to the goal box and use an indi- 
rect route to reach it. Iversen and Matsuzawa 
(2001) trained two chimpanzees to navigate 
through mazes presented on a computer 
touch screen. Chimpanzees gradually mas-
tered a series of mazes of increasing diffi-
culty. One of the chimpanzees learned to use
detours when the route on a familiar maze 
was blocked and later was able to use detours 
on novel mazes. The authors indicated, how- 
ever, that subjects did not fully develop a 
generalized ability to solve mazes, and some 
practice with the particular mazes seemed 
to be required to solve the problem. 


SHORTCUTS 


Fieldworkers often report that several 
species of primates travel from certain lo- 
cations to others in an efficient manner — 
that is, taking the shortest routes possi- 
ble (Garber, 1989; Sigg, 1986). Menzel 
(1973) tested the ability of four young 
captive chimpanzees to use least-distance
strategies while collect-
ing the most food rewards in a large enclo-
sure. He found that chimpanzees minimized
the distance traveled. Similarly, Boesch and 
Boesch (1984) found that wild chimpanzees 
traveled efficiently when collecting stones 
needed to crack open nuts. They selected 
stones that were closer to their current loca- 
tion. Recently, a number of researchers have 
described the use of least-distance strategies 
in vervet monkeys, common marmosets, and 
yellow-nosed monkeys (Cramer & Gallistel, 
1997; MacDonald, Pang, & Gibeault, 1994; 
MacDonald & Wilkie, 1990). 

There also is a computerized version of 
the shortcut task. Washburn (1992; see also 
Washburn & Rumbaugh, 1992) presented
rhesus monkeys with a moving target on a 
screen that they had to intercept with a cur-
sor that subjects controlled with a joystick.
To do so appropriately, subjects not only had 
to chase the target but on many occasions 
had to predict its location and use shortcuts 
to ambush it because the target speed was 
equal to or greater than that of the cursor. In
other words, they had to take shortcuts to in- 
tercept the moving target. This skill may be 
more demanding than using shortcuts when 
traveling between various food sources be- 
cause subjects have to adapt to a moving tar- 
get. This may be a useful skill in intercept- 
ing prey or competitors who hold valuable 
resources. These authors found that sub- 
jects again were more effective at intercept- 
ing targets when they followed predictable 
rather than unpredictable paths. Although 
subjects required some experience to learn 
the paths of the targets, results from pre-
sentation of novel target paths (e.g., the tar-
get disappearing on the top and reappearing
on the bottom of the screen) suggested that 
monkeys had learned a general rule about 
the target’s behavior rather than a set of 
stimulus-response associations. In a similar 
vein, Iversen and Matsuzawa (2001) indi-
cated that when mazes had one short and 
one long route to the goal box, the chim-
panzee selected the short one. 


SEARCH FOR MOVING OBJECTS 


Several studies investigated the ability of pri- 
mates to retrieve objects after they have un- 


object permanence tasks, the experimenter 
places a piece of food under a small container 
that is displaced under several other contain- 
ers and the food is left under one of them. To 
solve this problem effectively, subjects have 
to search under all and only boxes under 
which the food might have been deposited 
given the trajectory of the box that initially 
contained the food. Several apes pass this 
task but monkeys do not (Call, 2000; De 
Blois, Novak, & Bond, 1998; De Blois & 
Novak, 1994; Dumas & Brunet, 1994; Natale 
et al., 1986), although there are individual 
exceptions (Schino, Spinozzi, & Berlinguer, 
1990). Apes also have problems if the two 
locations visited are not adjacent; that is, the 
experimenter visits the box on the right and 
the left, leaving the center box untouched 
(Call, 2000; De Blois, Novak, & Bond, 1998; 
Natale et al., 1986). This is interpreted as 
reconstructing the trajectory of the reward. 

Other types of displacements recently 
have been investigated with apes. In ro- 
tational displacements, a reward is hidden 
under one of two cups and the platform is 
rotated circularly — for instance, 180 degrees. 
In transpositions, the reward is placed under 
one of various containers and their locations 
are swapped while the platform remains 
stationary. Results show that chimpanzees, 
orangutans, and bonobos are capable of solv- 
ing these displacements (Beran & Minahan, 
2000; Call, 2003). Taken together, this 
means that subjects can track a variety of dis- 
placements based on the movement of the 
object (object permanence), the containers 
(transpositions), or the substrate on which 
the object and containers rest (rotations). All 
these result in changes in the location of the 
object and subjects can infer its position. 
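
The three kinds of displacement just described can be pictured as operations on a left-to-right row of containers. The Python sketch below is only an illustration under simplifying assumptions: the list encoding, the function names, and the labels "baited" and "empty" are hypothetical and are not taken from the studies cited.

```python
# Illustrative sketch only: containers are modeled as a left-to-right list.
# The labels and function names are assumptions made for this example.

def transpose(positions, i, j):
    """Swap the containers at indices i and j; the platform stays still."""
    swapped = list(positions)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return swapped


def rotate_180(positions):
    """Rotate the whole platform by 180 degrees, reversing left-to-right order."""
    return list(reversed(positions))


def locate(positions, target="baited"):
    """Return the index of the container that now holds the reward."""
    return positions.index(target)


# Two cups on a platform, reward under the left cup, platform rotated 180 degrees.
two_cups = ["baited", "empty"]
after_rotation = rotate_180(two_cups)

# Three containers, reward in the middle, middle and right containers swapped.
three_cups = ["empty", "baited", "empty"]
after_swap = transpose(three_cups, 1, 2)
```

In both cases the reward ends up somewhere other than where it was last seen, so a subject that simply returns to the original hiding place would fail; success requires updating the location after each transformation.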

In summary, primates are capable of trav- 
elling efficiently by using detours and short- 
cuts and they track the displacement of 
hidden objects and infer their new locations 
after various spatial transformations. 


Relational Reasoning 


In simple discrimination problems, subjects 
learn to respond to a single stimulus or to a 
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tion. Discrimination learning of relational 
categories, on the other hand, involves con- 
cepts that can be learned only by compar- 
ing stimuli to one another and inducing a 
relation (e.g., “same as,” “larger than”). The 
three most studied instances of relational 
concepts are the identity relation as manifest 
in generalized match-to-sample problems, 
the oddity relation as manifest in general- 
ized oddity problems, and the 
sameness-difference relation as manifest in 
generalized relation-matching problems. In all three 
cases, the basic idea is that the subject is 
given some problems in training that can be 
solved by attending to a relation and then is 
given transfer tests that use completely dif- 
ferent objects that can be seen as instanti- 
ations of that same relation. If learning is 
relatively fast in the transfer phase, the infer- 
ence is that the subject acquired a relational 
concept in the training phase and is now ap- 
plying it in the transfer phase. If the learning 
is at the same basic rate in training and trans- 
fer (with some allowance for the formation 
of a learning set), the inference is that the 
subject has not learned a relational concept 
but is treating each new problem as a sepa- 
rate entity with its own particular stimulus 
characteristics. 


Identity 


Several studies have shown that monkeys 
and chimpanzees can solve identity prob- 
lems based on generalized matching to sam- 
ple (D’Amato & Salmon, 1984; D’Amato 
et al., 1986; Nissen, Blum, & Blum, 1948; 
Oden, Thompson, & Premack, 1988). In 
the only study of which we are aware in 
which human children were tested in this 
same type of procedure, they, like the chim- 
panzees, generalized immediately to new 
match-to-sample problems using only two 
sets of stimuli in training (Weinstein, 1941). 
It should be noted, however, that this suc- 
cessful performance with one stimulus di- 
mension (e.g., shape) does not generalize in 
most studies across other stimulus dimen- 
sions (D’Amato & Colombo, 1985; Jackson 
& Pegram, 1970a, 1970b; Kojima, 1979; 


Jitsumori, 1990). Monkeys trained with 
shapes and capable of solving identity prob- 
lems with novel shapes, for instance, do 
not transfer their identity concept to other 
dimensions such as color (see Doumas & 
Hummel, Chap. 4, for a discussion of 
relational generalization). The rule that 
monkeys seem to learn is therefore better 
characterized as “pick the same shape” rather 
than “pick the same.” 


Oddity 


Numerous studies demonstrate that many 
primate species can acquire the concept 
of oddity, as evidenced by their ability to 
solve novel problems after a period of train- 
ing (King & Fobes, 1982; Rumbaugh & 
McCormack, 1967; Thomas & Boyd, 1973). 
Some primate species have also been able 
to solve dimension-abstracted oddity prob- 
lems in which the odd object must be distin- 
guished from four other alternatives that are 
not identical to one another (as in traditional 
oddity problems) but only resemble one 
another with respect to some dimensions 
(e.g., objects of different shapes that are all 
red, as opposed to the odd object, which 
is blue). Macaques, squirrel monkeys, chim- 
panzees, and gorillas were capable of solv- 
ing this problem (Bernstein, 1961; Thomas 
& Frost, 1983). Human children have been 
presented with oddity problems in a number 
of studies and generally perform very well in 
the earliest trials of transfer (e.g., Lipsett & 
Serunian, 1963). 


Sameness-Difference 


In the previous two tasks, subjects have to 
respond either to similarity or difference. 
Some tasks have investigated whether sub- 
jects can decide whether a pair of stim- 
uli are similar or different simultaneously. 
Several monkey species, chimpanzees, and 
orangutans were capable of judging whether 
two stimuli were “same” or “different” 
(Wright, Santiago, & Sands, 1984; Fujita, 
1983; King & Fobes, 1975; Robinson, 1955, 
1960; King, 1973). These studies invari- 
ably involved subjects making judgments 
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level. Recently, however, Bovet and Vauclair 
(2001) investigated the ability of baboons 
to make same-different judgments based 
on the functional properties of the stimuli. 
They presented baboons with pairs of stim- 
uli corresponding to two different categories 
(i.e., food vs. nonfood). The items belong- 
ing to these two categories varied in their 
perceptual features. Results indicated that 
baboons were capable of judging as “same” 
items belonging to the same category despite 
their perceptual dissimilarities and as “differ- 
ent” items belonging to different categories. 


Analogical Reasoning 


Premack (1983) argued that identity, odd-
ity, and sameness-difference tasks as tradi-
tionally administered do not require the kind 
of relational concepts that investigators have 
claimed. Because the matching takes place 
across trials in all of these tasks, he claimed 
that “the animal simply reacts to whether it 
has experienced the item before. Old/new 
or familiar/unfamiliar would be better tags 
for this case than same/different” (Ref. 86, 
p. 354). Instead, he advocated use of a gener- 
alized match-to-sample procedure in which 
the matching to be accomplished involves 
the relations between items. Premack (1983) 
presented chimpanzees with a sample pair of 
stimuli that either matched (so-called AA 
pairs, such as two apples) or that did not 
match (so-called CD pairs, such as a pear and 
an orange). Their task was to pick which of 
two alternatives matched the relation exem-
plified in the sample: either a pair of new 
items that matched (a so-called BB pair, such 
as two bananas) or a pair of new items that 
did not match (a so-called EF pair, such as 
a plum and a grape). When the sample was 
AA, the subject was to choose BB (rather 
than EF) because the relation between items 
in both cases is one of “sameness.” If the 
sample was CD, the subject should choose 
EF (rather than BB) because the relation 
between items in each case was one of 
“difference.” 
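
The logic of this relation-matching task can be made concrete with a short Python sketch. It is offered only as an illustration: the function names and the tuple encoding of stimulus pairs are hypothetical, and the code says nothing about how chimpanzees actually arrive at their choices.

```python
# Illustrative sketch only: each stimulus pair is a tuple of item labels.
# Function names and encoding are assumptions made for this example.

def relation(pair):
    """Classify a pair as 'same' or 'different' by comparing its two items."""
    first, second = pair
    return "same" if first == second else "different"


def choose_alternative(sample, alternatives):
    """Return the alternative whose internal relation matches the sample's."""
    target = relation(sample)
    for alternative in alternatives:
        if relation(alternative) == target:
            return alternative
    return None


alternatives = [("banana", "banana"), ("plum", "grape")]  # BB and EF
choice_for_aa = choose_alternative(("apple", "apple"), alternatives)
choice_for_cd = choose_alternative(("pear", "orange"), alternatives)
```

Because no item in the alternatives appears in the sample, a familiarity judgment about individual items cannot select the correct pair; only the relation inside each pair distinguishes the two alternatives.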


presented the language-trained chimpanzee 
Sarah (Premack, 1976) with pairs of objects 
that had various relations; Sarah’s job was 
to identify another pair that had an analo- 
gous relation. In so-called figural problems, 
Sarah was presented with an odd shape with 
a dot on it and that same shape without the 
dot; she was then presented with another 
shape with a dot and had to choose from 
a pair of alternatives that same shape with-
out the dot (i.e., the analogous relation of 
two shapes with and without a dot). In so-
called conceptual problems, Sarah was pre-
sented with household items with which she 
was familiar and asked to draw analogies, for 
example, between a key and lock and a can 
opener and can. On figural items, Sarah per- 
formed correctly about three-quarters of the 
time and on conceptual items she was cor- 
rect at a slightly higher rate. Having ruled 
out various possible alternative explanations, 
the investigators concluded that Sarah was 
able to understand the relation in the first 
pair of stimuli at a level of abstraction suf- 
ficient to allow her to identify it in subse- 
quent stimulus pairs, both perceptually and 
conceptually. 

Recently, Thompson, Oden, and Boysen 
(1997) found that language-naive chim- 
panzees were also able to solve analogies if, 
prior to testing, they had been trained to 
associate a token of one shape with pairs 
of similar items and a different token for a 
pair of items that were not similar. Presented 
with the same token, they selected the sim- 
ilar pair, and presented with the different 
token, they selected the pair with unequal 
items (this is comparable to Burdyn and 
Thomas, 1984, in which squirrel monkeys 
used a figure to choose between the identical 
or the different objects in the pair). In the 
testing phase, subjects were presented with 
a pair of identical or different objects as a 
sample and two choices (one that bore the 
same relation as the sample and another 
with a different relation). Chimpanzees per- 
formed above chance, indicating that they 
could identify the relation between relations. 
In contrast, rhesus monkeys presented with 
an analogous procedure were unable to solve 
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solve other kinds of relations such as cor-
rectly identifying the perceptual analogies 
(Washburn, Thompson, & Oden, cited in 
Thompson & Oden, 2000). 

Several authors have indicated that only 
chimpanzees that had received some train- 
ing that involved using tokens represent- 
ing “same” and “different” were able to 
solve analogies (Premack, 1983; Thompson 
& Oden, 2000). They argued that learn- 
ing a symbolic code such as language fun- 
damentally changed the nature of the cog- 
nitive representations used by chimpanzees 
by providing them with an abstract proposi- 
tional code (rather than a concrete imaginal 
code) in terms of which they might inter- 
pret their experience. This idea, however, 
has been challenged from two main direc- 
tions. First, Oden, Thompson, and Premack 
(1990) used a different procedure and found 
that four infant chimpanzees (around one 
year of age with no language training) also 
engaged in the matching of relations. They 
simply presented the subject with a sample 
stimulus that consisted of a pair of objects 
mounted on a small board; the pair could 
either match (AA) or not match (CD). Sub- 
jects could play with this sample as desired, 
and their play time was recorded. They were 
then presented with two test pairs of objects, 
also mounted on a board, that they might play 
with; one was a matching pair (BB) and one 
was a nonmatching pair (EF). Subjects’ ini- 
tial play with the sample affected their han- 
dling time with the new test pairs. If sub- 
jects had played with the sample pair that 
matched (AA) they were no longer inter- 
ested in the matching relation and so played 
more with the nonmatching test pair (EF); 
if they had played with the nonmatching 
sample (CD), they played more with the 
matching test pair (BB). The conclusion of 
these investigators was that chimpanzees can 
understand relations among relations, even 
if they do not always show this competence 
in tasks in which they must actively choose 
stimuli. The modified conclusion of these 
authors was that, although chimpanzees un- 
derstand second-order relations, language 
training helps them incorporate this into 


children perform like chimpanzees and dis- 
tinguish the relations, whereas monkeys do 
not (Thompson & Oden, 2000). Thompson 
and Oden (2000), however, have indicated 
that these represent implicit rather than ex- 
plicit judgments of the kind shown in gen- 
eralized relation-matching tasks (see Litman 
& Reber, Chap. 18 on implicit thinking). 

Second, and more importantly, Vonk 
(2003) has recently shown that orangutans 
and a gorilla can solve analogies without 
any token experience or extensive training 
(see also Smith et al., 1975) using a delayed 
matching to sample (DMTS) task in which 
subjects had to match the relation repre- 
sented by a pair of geometric figures to those 
of one of the two alternatives provided — the 
same method used by other authors to test 
analogical reasoning in chimpanzees. There 
is also a study with baboons that showed 
they can match a sample depicting a set 
of identical or different items to the corre- 
sponding alternative (Fagot, Wasserman, & 
Young, 2001). Unlike previous studies with 
apes, however, baboons reached high perfor- 
mance only when the sample and alterna- 
tives were formed by multiple items. When 
the number of items was reduced, there was 
a clear decrement in accuracy, particularly 
for the “different” samples. This effect, as 
well as the extensive training involved (ba- 
boons received thousands of trials before 
they mastered the initial task), opens the 
door to other interpretations based on 
perceptual entropy. 

One area in analogical reasoning that 
has received some recent research attention 
from a comparative perspective is that of 
spatial analogy. Using a task pioneered by 
DeLoache and colleagues (DeLoache, 1995), 
Kuhlmeier, Boysen, and Mukobi (1999) 
tested the ability of chimpanzees to make 
spatial analogies. Subjects were presented 
with a very accurate three-dimensional scale 
model of a room and subjects witnessed 
how the experimenter placed an object (e.g., 
soda can) in a particular location in the 
scale model (e.g., inside a cupboard). Sub- 
jects then moved to the real room and were 
allowed to search the room. Chimpanzees 
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accurately predict the location of the ob- 
ject in the room and vice versa; they were 
able to point to a location in the scale model 
that corresponded to a location in the actual 
room. Initially, female chimpanzees were 
more proficient than males at this task. Male 
chimpanzees tended to search the room in 
a predetermined pattern until they eventu- 
ally found the object rather than going to the 
specific places indicated in the scale model. 
When reward delivery was made contingent 
on visiting the specific location on their first 
try, however, males’ performance became 
comparable to that of female chimpanzees 
(Kuhlmeier & Boysen, 2001). 

In summary, primates are capable of per- 
ceiving various types of relations between 
objects. Moreover, apes can solve analogies 
regarding the similarity or difference be- 
tween pairs of objects, and chimpanzees can 
also solve spatial analogies involving the use 
of scale models. 


Inferential Reasoning 


Transitivity 


The use of transitive inference has re- 
ceived much research attention. Although 
human studies on transitivity (see Halford, 
Chap. 22) have often used stimuli that vary 
systematically and naturally along a quanti- 
tative dimension such as height (e.g., Piaget, 
1952), most studies with primates have used 
so-called associative transitivity. This con- 
sists of presenting subjects with pairs of ar- 
bitrary stimuli and differentially reinforcing 
one stimulus of the pair, thereby creat-
ing different values. For instance, the red cup 
is always reinforced when presented with 
the blue cup, whereas the blue cup is always 
reinforced when presented with the yellow 
cup, and so on. Once the initial pairs are 
trained, subjects are presented with pairs of 
stimuli that have not been paired before, for 
instance, red versus yellow cup. 
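
To make the structure of the test concrete, the Python sketch below chains the trained pairs into a single order and uses that order to answer a novel pair. It is a minimal illustration that assumes a linear-order solution, which is only one candidate account of how subjects might solve the task; the function names and the encoding of pairs as (reinforced, nonreinforced) tuples are hypothetical.

```python
# Illustrative sketch only: it assumes the subject builds a linear order,
# which is one possible account and not an established mechanism.

def build_order(trained_pairs):
    """Chain (reinforced, nonreinforced) pairs into one ordered series."""
    next_item = dict(trained_pairs)
    reinforced = {winner for winner, _ in trained_pairs}
    nonreinforced = {loser for _, loser in trained_pairs}
    # The top item is the only one that is never the nonreinforced member.
    (top,) = reinforced - nonreinforced
    order = [top]
    while order[-1] in next_item:
        order.append(next_item[order[-1]])
    return order


def predict_choice(order, a, b):
    """Pick whichever item stands earlier (higher) in the series."""
    return a if order.index(a) < order.index(b) else b


trained = [("red", "blue"), ("blue", "yellow")]
order = build_order(trained)
novel_choice = predict_choice(order, "yellow", "red")
```

The novel pair of red versus yellow never appears among the trained pairs, so a correct choice cannot be read off any single training experience; it has to be derived from the two trained pairs taken together.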

There is ample evidence showing that 
primates can make transitive inferences 
when subjects are presented with novel 


Colombo, 1988; Gillan, 1981; Boysen et al., 
1993). This includes cases in which sub- 
jects have been trained with more than three 
stimuli. This is important because the most 
interesting cases are those that involve in- 
termediate stimuli — that is, stimuli that are 
not the first or the last of the sequence, be- 
cause those are always or never reinforced, 
respectively. D’Amato and Colombo (1988) 
trained capuchin monkeys to touch five ar- 
bitrary items in a specified order (labeled 
A, B, C, D, and E). After they had mas- 
tered this task they were presented with 
novel pairs. Of particular importance were 
the internal pairs B-C, C-D, and B-D. The 
B-D comparison was especially important 
because these two items were both internal 
to the series and were nonadjacent to one 
another in the previous training. Subjects 
ordered these three internal pairs correctly 
81% to 88% of the time, well above chance. 
When presented with triplets from which 
they were to choose the highest item, they 
ordered the internal triplet B-C-D correctly 
94% of the time, also well above chance. 
This finding essentially replicates, with even 
stronger results, the findings of McGonigle 
and Chalmers (1977) with squirrel monkeys. 
These authors also found evidence for a sym- 
bolic distance effect — the farther apart two 
items, the more successful the subjects, pre- 
sumably because the items were easier to 
distinguish. 

One open question is, What is the mecha- 
nism responsible for this performance? Two 
mechanisms have been postulated: the as-
sociative mechanism based on responding 
to the differential reinforcement and asso- 
ciative strength of the stimuli, and the re- 
lational or linear mechanism based on cre- 
ating a mental order of the stimuli. Bond, 
Kamil, and Balda (2003) argued that, under 
an associative mechanism, errors increase at 
the end of the sequence, whereas latencies 
should be unaffected regardless of the po- 
sition of the items. In contrast, the rela- 
tional mechanism predicts that accuracy will 
remain unchanged, whereas the latency to 
respond will be affected. First, subjects’ la- 
tency to respond to the first item of a pair 
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ries: They responded most quickly to pairs 
in which the first item was A, then for pairs 
in which the first item was B, then C, then 
D. The implication is that each time they are 
presented with a pair, the subjects are men- 
tally reconstructing the entire five-item se- 
ries (D’Amato & Colombo, 1988). Second, 
animals responded most quickly to the sec- 
ond item of a pair for pairs with adjacent 
items (e.g., A-B, C-D, etc.), then for pairs 
separated by one gap (e.g., A-C, C-E, etc.), 
then for pairs separated by two gaps (i.e., 
A-D, B-E), and they were slowest on the 
second item when the gap was three (A-E). 
Again, the implication is that subjects are 
going through the entire series mentally on 
every trial. Swartz, Chen, and Terrace (1991) 
essentially replicated these results — both in 
terms of ordinal judgments and in terms of 
reaction times — for rhesus macaques. 
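
The grouping of pairs by the number of intervening items can be enumerated directly. The Python sketch below is only an illustration of how the ten possible pairs of a five-item series fall into the gap categories mentioned above; the function name and the string encoding of the series are hypothetical.

```python
# Illustrative sketch only: enumerate the pairs of a five-item series by the
# number of items that lie between the two members of each pair.
from itertools import combinations

SERIES = "ABCDE"


def pairs_by_gap(series=SERIES):
    """Map each gap size to the list of pairs separated by that many items."""
    groups = {}
    for i, j in combinations(range(len(series)), 2):
        gap = j - i - 1  # number of series items skipped between the two
        groups.setdefault(gap, []).append(f"{series[i]}-{series[j]}")
    return groups


groups = pairs_by_gap()
```

Under a serial-scan account, the time to reach the second item should grow with the gap, so the four adjacent pairs should be fastest and the single pair A-E slowest, which is the pattern reported.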
Although these results are quite convinc- 
ing, D’Amato and Colombo (1989) pointed 
out that the results of this study are com- 
patible with an associative chain interpre- 
tation in which each item simply serves as 
a discriminative stimulus evoking the next 
item, obviating the need for some represen- 
tation of serial order. To investigate whether 
capuchin monkeys were also associating a 
specific serial position with each item in the 
associative chain, D’Amato and Colombo 
(1989) used a procedure that essentially 
broke the chain. Using monkeys who had 
already learned the ABCDE sequence, on 
some trials they introduced a “wild card” 
item at a particular point in the sequence 
(e.g., ABCXE). This was a novel item that 
had never been used as part of the training 
and therefore had no associations with any 
other items. These investigators found that 
no matter the position in which the wild card 
item appeared, subjects treated it in a man- 
ner similar to the item it replaced at above- 
chance levels, touching it at the appropriate 
place in the sequence approximately 60% of 
the time. They performed just as well with 
sequences containing two wild card items. 
Consequently, D’Amato and Colombo ar- 
gued that the monkeys in this study, and 
presumably in previous studies, were operat- 


chain; they were operating with some men- 
tally represented sequence of items in which 
the ordinal position of each item was essen- 
tial information. 


Ordinality 


Many of the studies on transitivity have al- 
ready indicated that monkeys learn some- 
thing about the linear representation in a se- 
ries — they learn the order in which the items 
should appear. Some studies have pushed 
this argument a bit further and have re-
placed boxes with Arabic numerals represent-
ing different quantities. Boysen et al. (1993) 
trained chimpanzees with pairs of adjacent 
numerals in the same pairwise manner used 
in the studies previously reported and then 
presented them with the novel pair 2-4. 
In this study, after appropri-
ate training with the initial pairs, subjects 
all were able to successfully choose the 4 
over the 2 in the novel pairwise test. The 
investigators concluded that with appropri- 
ate training, chimpanzees can learn the se- 
rial order of symbolic stimuli. Washburn 
and Rumbaugh (1991) taught two rhesus 
macaques to associate Arabic numerals with 
the reception of a corresponding number of 
food pellets. Because monkeys try to max- 
imize their food intake, they learned to se- 
lect the larger quantity represented by the 
various numerals that were presented to 
them. The authors reserved some of the pairs 
for their transitivity tests. One of the two 
subjects was above chance in choosing the 
larger member of the novel pair in the very 
first set of trials. Similarly, presented with 
five numerals simultaneously, both subjects 
were above chance immediately in choosing 
the largest one. These investigators interpret 
their results as indicating that the monkeys 
formed a representation of a “matrix of val- 
ues” corresponding to the numerals. 
Additional evidence for ordinality is pro- 
vided by two studies not based on a transi- 
tivity paradigm. First, Brannon and Terrace 
(1998) trained rhesus macaques to touch 
a series of stimuli depicting different nu- 
merosities in ascending order. Initially, the 
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picting numerosities ranging from 1 to 4. In 
transfer tests, monkeys were able to solve 
problems involving numerosities ranging from 5 
to 9. The authors argued that their results 
demonstrated that rhesus macaques can rep-
resent numerosities ranging from 1 to 9 in an 
ordinal manner. In a follow-up study, Bran-
non and Terrace (2000) trained monkeys to 
touch a series of stimuli in descending or-
der. They failed to transfer the descending 
rule into new numerosities, however. Mon- 
keys also failed to learn to select numerosi-
ties in a nonmonotonic series despite extensive 
training. This suggests that ordinality may be 
an especially salient dimension for monkeys. 
The accuracy in responding and the latency 
indicated distance effects similar to those 
in other studies, including human studies 
(e.g., Moyer & Landauer, 1967). Second, 
Kawai and Matsuzawa (2000) showed that 
the chimpanzee Ai was capable of select- 
ing Arabic numerals presented on the com- 
puter screen in ascending order. She had 90% 
accuracy with four-numeral series and 65% 
accuracy with five-numeral series. The re- 
sponse latency was longest for the first item 
in the series compared with the remaining 
ones, which suggests the chimpanzee was 
planning the sequence before executing the 
entire sequence. 


Conjunctive Negation 


This refers to the ability to infer that if a 
given object can be located in one of two 
containers and, upon searching the first con- 
tainer, is not found there, then it must be in 
the other one. Premack and Premack (1994) 
presented chimpanzees with two boxes and 
two types of fruit, such as a banana and an 
apple. Chimpanzees were allowed to witness 
the experimenter deposit each fruit in one 
of the boxes so that both boxes were baited. 
Later, subjects saw the experimenter eating 
one of the fruits (e.g., banana) and the ques- 
tion was whether given the opportunity to 
select either box, they would select the one 
in which the experimenter had deposited 
the food he was not currently eating (i.e., 
apple), presumably because it still contained 


the fruit. Chimpanzees solved this prob-


lem quickly, without trial-and-error, show- 
ing that they were able to infer that if the ex- 
perimenter was eating the banana, the box 
where the banana was deposited would be 
empty. 

More recently, Call (2004) presented all 
four great apes with two cups (one baited) 
and gave visual or auditory information 
about the contents of one or both cups. Vi- 
sual information consisted of removing the 
top of the cup so that subjects could look 
inside it. Auditory information consisted of 
shaking the cup so that it produced a rattling 
sound when the food was inside. Subjects 
correctly selected the baited cup both when 
they saw the food and when they heard it. 
More importantly, subjects also selected the 
correct cup when only the empty cup was ei- 
ther shown or shaken. This means that sub- 
jects chose correctly without having seen or 
heard the food. Control tests showed that, 
in general, subjects were not more attracted 
to noisy cups or avoided shaken noiseless 
cups. Also, subjects were unable to learn to 
use other comparable auditory cues such as 
tapping on the baited cup to find the food. 
The author argued that apes made inferences 
about the food location, rather than just as- 
sociating an auditory cue with the reward. 
This suggests that subjects understood that 
the food caused the noise, not simply that 
the noise was associated with the food. 
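
The inference required in the empty-cup condition can be sketched as a simple exclusion rule. The Python fragment below is an illustration only: the function name, the cup labels, and the encoding of evidence as True (food seen or heard) or False (cup shown or shaken and found empty) are hypothetical and make no claim about the apes' actual reasoning process.

```python
# Illustrative sketch only: evidence maps a cup to True (food seen or heard)
# or False (cup shown or shaken and found empty). Cups without evidence are
# simply absent from the mapping.

def choose_cup(cups, evidence):
    """Pick a cup from direct evidence, or by excluding cups known to be empty."""
    # Direct evidence: the food was seen or heard in one of the cups.
    for cup in cups:
        if evidence.get(cup) is True:
            return cup
    # Exclusion: discard every cup known to be empty and see what remains.
    candidates = [cup for cup in cups if evidence.get(cup) is not False]
    return candidates[0] if len(candidates) == 1 else None


cups = ["left", "right"]
direct_choice = choose_cup(cups, {"left": True})
exclusion_choice = choose_cup(cups, {"left": False})
no_information = choose_cup(cups, {})
```

The second call corresponds to the critical condition: no information about the baited cup is given, yet the correct cup is the only candidate left once the empty one is ruled out.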

There are also two studies in which chim- 
panzees were able to solve inferential ex- 
clusion in a matching to sample paradigm. 
Hashiya and Kojima (2001) presented a 
chimpanzee with two pictures of people 
she knew and the voice of one of them. 
The chimpanzee successfully matched the 
voice with the correct picture. Then Hashiya 
and Kojima (2001) presented her with two 
pictures (one of someone she knew and 
the other of someone she did not know) 
and an unfamiliar voice. The chimpanzee 
correctly matched the unfamiliar voice to 
the unfamiliar picture. Beran and Washburn 
(2002) presented chimpanzees with pictures 
and lexigrams as samples and alternatives, 
respectively. Pictures and lexigrams could 
be either familiar or unfamiliar. Familiar 
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tures and lexigrams, respectively, that sub- 
jects had learned to associate before the test. 
As expected, chimpanzees reliably selected 
the familiar appropriate lexigrams repre- 
senting the sample pictures, but, in addi- 
tion, chimpanzees also were able to select 
unfamiliar lexigrams when presented with 
familiar lexigrams and an unfamiliar picture. 
Their success in this task did not translate 
into the acquisition of the unfamiliar lexi- 
gram as a representation for the unfamiliar 
picture, however. 

In summary, primates are capable of 
making inferences about pieces of missing 
information in transitivity and conjunctive 
negation problems of various types. 


Quantitative Reasoning 


Primates can perform operations on quanti- 
ties by adapting to novel arrays when some 
quantities are added, subtracted, or simply 
change in appearance but remain constant. 
They also have some skills in counting, on 
which these more complex skills depend. 


Counting 


Rumbaugh and colleagues (Beran & 
Rumbaugh, 2001; Rumbaugh et al., 1989) 
presented two chimpanzees with a com- 
puter task in which subjects had to collect 
the number of dots from the bottom of 
the screen specified by an Arabic numeral 
presented on the top of the screen. Subjects 
indicated when they had finished their 
selection with the use of the cursor. The 
chimpanzees performed above chance 
with the numerals up to six and seven, 
respectively. Rumbaugh et al. (1989) also 
indicated that the chimpanzee Lana could 
solve this task even if the squares disap- 
peared as she touched them (with a tone 
sounding as they disappeared) — implying 
that she could keep track mentally of how 
many she had already touched. The authors 
ruled out some explanations such as subitiz- 
ing or using the temporal pattern rather 
than the number of dots as the basis for 


argued that chimpanzees’ performance 
decreased proportionally to the magnitude 
of the numerals presented. The authors 
argued that the chimpanzees seemed to rep- 
resent quantities in a continuous rather than 
a discrete fashion. This characterization 
differs in some way from that of Rumbaugh 
et al. (1989), who indicated that these 
subjects were counting in a way similar to 
human children in that they knew not only 
the ordinality but also the cardinality of the 
Arabic numerals involved. 

Boysen and Berntson (1989) trained the 
chimpanzee Sheba to count objects using 
Arabic numerals using a different method. 
First, they administered a one-to-one corre- 
spondence task in which she had to place one 
and only one object in each of six compart- 
ments of a divided tray. Then, she was re- 
quired to pick a card with the same number 
of dots as the number of food items (rang-
ing from one to three pieces) presented on a 
tray. The researchers then replaced the cards 
with dots with cards having Arabic numer- 
als and continued the training until Sheba 
was able to select the Arabic numeral corre- 
sponding to the number of dots on a card. 
Finally, the authors trained the subject in 
Arabic numeral comprehension so that, pre- 
sented with an Arabic numeral, she had to 
select the card with the corresponding num- 
ber of dots. After she mastered these tasks, 
Boysen and Berntson conducted two trans- 
fer tests. First, they presented her with one, 
two, or three common household items and 
asked her to pick the corresponding Ara- 
bic numeral, which she readily did. Second, 
they introduced the Arabic numerals 4, 5, 
and 0 directly (without first using cards with 
dots), and Sheba readily learned to associate 
these with the correct number of objects as 
well. 

During this training of Sheba, Boysen and 
Berntson noticed that she often engaged in 
“indicating acts” as she counted. That is, she 
touched, displaced, or “pointed to” objects 
serially in attempting to determine the ap- 
propriate Arabic numeral — much as human 
children touch or otherwise indicate objects 
as they count them. In a follow-up study, 
the investigators examined 
whether the number of indicating acts Sheba 
used as she engaged in these tasks correlated 
with the number of items in the array (by the 
time of this study, Sheba knew the numerals 
0-7). They gave her some counting tasks, 
using the numerals 0 to 7, and found that 
she correctly counted 54% of the time (with 
errors distributed equally across the 1 to 7 
range). They found further that whereas the 
absolute number of Sheba’s indicating acts 
did not correspond to the number of items 
in the array precisely, typically being about 
twice as large, these did correlate signifi- 
cantly (r = .74). It is unclear whether this 
correlation is attributable to counting or to 
the fact that determining the numerosity of 
the larger numerals requires more time, so 
that a constant rate of indicating acts across 
all numerals would lead to the correlation. 
In any case, the investigators concluded that 
Sheba was counting objects in much the 
same way as human children, and that her in- 
dicating acts were serving a mediating func- 
tion in the process. 
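
The ambiguity noted above can be made concrete with a small simulation. The sketch below is purely illustrative and is not drawn from the investigators' data: it assumes that the time needed to enumerate an array grows with its size and that indicating acts occur at a constant rate, with no counting at all. Every parameter value (base time, time per item, rate, noise) is hypothetical.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_constant_rate(trials_per_size=20, seed=0):
    """Simulate indicating acts emitted at a constant rate (no counting).

    Hypothetical assumptions: enumeration time grows linearly with array
    size, plus Gaussian noise, and acts occur at a fixed rate per second.
    """
    rng = random.Random(seed)
    base_time, time_per_item, noise_sd = 1.0, 0.5, 0.3  # seconds (assumed)
    acts_per_second = 3.0                                # assumed constant rate
    sizes, acts = [], []
    for size in range(1, 8):  # arrays of 1 to 7 items
        for _ in range(trials_per_size):
            duration = max(
                0.0, base_time + time_per_item * size + rng.gauss(0.0, noise_sd)
            )
            sizes.append(size)
            acts.append(round(acts_per_second * duration))
    return sizes, acts


sizes, acts = simulate_constant_rate()
r = pearson_r(sizes, acts)
print(f"correlation between array size and indicating acts: r = {r:.2f}")
```

Under these assumptions a strong positive correlation appears even though the simulated subject never counts, which is why the correlation alone cannot separate the two accounts.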


Summation and Subtraction 


Rumbaugh and colleagues (Perusse & 
Rumbaugh, 1990; Rumbaugh, Savage- 
Rumbaugh, & Hegel, 1987; Rumbaugh, 
Savage-Rumbaugh, & Pate, 1988) pre- 
sented two language-trained chimpanzees 
(Sherman and Austin) with two unequal 
sets of candies (M&Ms), each presented as 
spatially distinct subsets. For instance, a 
trial consisted of presenting four and three 
candies compared with five and one candies. 
Chimpanzees were capable of comparing 
these two sets and combining the spatially 
distinct subsets (e.g., 4 + 3 vs. 5 + 1) to net 
the larger total array (up to a maximum of 
seven candies) on more than 90% of the 
trials. Although the investigators did not 
claim that their subjects “added” numbers 
in anything resembling the human method, 
they argued that the skills required for this 
task go beyond simple subitizing, because 
the items in each of the two quantities to be 
compared are separated into two spatially 
distinct subsets. This is far from summation 
understood as a mental operation, however, 
because subjects were not required to 
perform any mental operations. Directly 
perceiving the larger of the two overall 
quantities on either side would suffice to 
solve this task. In other words, this can be 
seen as a relative numerousness judgment 
over a large area, without any operation 
beyond perception being implicated. 

In an attempt to solve this problem, Call 
(2000) presented orangutans with two quan- 
tities in two dishes and then added a third 
one into one of the dishes. In some trials, 
this made the initially smaller quantity the 
larger one; in others, the ordering of the two 
did not change. Call (2000) also subtracted 
quantities from the initial quantities and 
showed how much he had subtracted. The 
important point is that subjects never saw 
the final quantities directly, but they had 
to decide based on how much had been 
added to or subtracted from the quanti- 
ties. Orangutans were capable of perform- 
ing above chance in both addition and 
subtraction. 

Sulkowski and Hauser (2001) also investi- 
gated subtraction in rhesus macaques. They 
showed subjects two quantities (up to three 
items each), hid them in two separate 
adjacent locations, and then removed 
either one or no items from each location. 
Rhesus macaques selected the location with 
more items even when subtractions occurred 
from both locations and when some noned- 
ible items rather than food were subtracted. 

Beran (2001) found that two chim- 
panzees were also able to add quantities 
presented sequentially up to nine pieces of 
candy (M&M). Unlike the previous study, 
each piece of candy was added individually 
to one of two cups rather than presenting the 
array in its totality. In different experiments, 
subjects witnessed the experimenter placing 
different quantities into the cups in various 
steps. All the candies could be added to a 
given cup at one time, or the addition rounds 
could be alternated between the two cups. 
In the final experiment, subjects witnessed 
the experimenter removing one candy from 
one of the cups before being allowed 
to choose. Both chimpanzees performed 
above chance in the addition trials, but 
only one chimpanzee was above chance 
in the subtraction trials. These two stud- 
ies suggest that orangutans and chimpanzees 
can represent quantities and mentally op- 
erate on those quantities to net the largest 
array. 

Olthof, Iden, and Roberts (1997) raised 
the bar a bit further and replaced the ac- 
tual quantities by Arabic numerals. First the 
monkeys were trained to identify the numer- 
als corresponding to quantities that subjects 
knew. Then subjects were given a choice be- 
tween different combinations of numerals 
involving two numerals in each pair, one nu- 
meral and two numerals, or three numerals 
in each pair, for instance, (1 + 1 + 3) against 
(2 + 2 + 2). Squirrel monkeys were capable 
of selecting the larger total quantity. Addi- 
tional tests indicated that this effect could 
not be explained by choosing the largest 
numeral available or avoiding the smallest 
number available. 
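
The logic of these control tests can be sketched in a few lines. The example below is illustrative only: the three choice rules and the second trial are hypothetical stand-ins, and only the (1 + 1 + 3) versus (2 + 2 + 2) comparison is taken from the text. It shows which set each rule would pick, so that trials on which the rules disagree can be identified.

```python
def choose_by_sum(left, right):
    """Pick the set with the larger total."""
    return "left" if sum(left) > sum(right) else "right"


def choose_by_largest_numeral(left, right):
    """Pick the set that contains the single largest numeral."""
    return "left" if max(left) > max(right) else "right"


def choose_by_avoiding_smallest(left, right):
    """Avoid the set that contains the single smallest numeral."""
    return "right" if min(left) < min(right) else "left"


RULES = {
    "larger sum": choose_by_sum,
    "largest numeral": choose_by_largest_numeral,
    "avoid smallest": choose_by_avoiding_smallest,
}

# The first trial comes from the text; the second is a hypothetical example
# chosen so that no totals, maxima, or minima are tied.
TRIALS = [
    ((1, 1, 3), (2, 2, 2)),
    ((1, 5), (2, 3)),
]

for left, right in TRIALS:
    picks = {name: rule(left, right) for name, rule in RULES.items()}
    print(left, "vs.", right, "->", picks)
```

On the first trial the sum rule and the largest-numeral rule pick opposite sets; on the second, the sum rule and the avoid-smallest rule do. Only choices on such diagnostic trials can show that subjects are responding to the total.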

Similarly, Boysen and Berntson (1989) 
also reported that Sheba was able to visit 
three locations in the room, look for hid- 
den items that might be there and report 
the total number of items (up to four) at a 
different location by picking up a card depicting 
the Arabic numeral corresponding to 
the total number of items available in the 
room. Sheba was able to do this by using ei- 
ther actual objects or Arabic numerals with 
an overall accuracy of 75% (chance = 25%). 
Given that Sheba also can make transitive 
inferences with an ordered series of items 
(Boysen et al., 1993) and uses indicating acts 
as she attempts to determine the numeri- 
cal value of sets of objects (see previous sec- 
tion), the investigators hypothesize that she 
is actually counting, in a human-like way, 
in these foraging tasks, and that her number 
concept is very much like that of a young 
human child. 


Conservation 


Piaget and Inhelder (1941) considered the 
ability to understand that physical quanti- 
ties remain constant after changing their per- 
ceptual appearance an important step to- 


children. Seven-year-old children and older 
understand that if two quantities were the 
same prior to a perceptual transformation 
(and nothing has been added or removed) 
they must be the same after the transforma- 
tion has taken place. This logical necessity is 
the cornerstone of the conservation experi- 
ments (see also Halford, Chap. 22). 

Although some studies on conservation 
in monkeys have been done (e.g., Thomas 
& Peay, 1976), the lack of information 
about how subjects judge the two quanti- 
ties prior to a transformation prevents us 
from drawing any conclusions. Two stud- 
ies with chimpanzees collected this informa- 
tion and therefore can be interpreted more 
accurately. First, Woodruff, Premack, and 
Kennel (1978) presented Sarah with liquid, 
solid, and number conservation tasks. Before 
this test, Sarah had learned to use plastic 
tokens to indicate whether a pair of stim- 
uli were “same” or “different.” In the liquid 
conservation task, she was presented with a 
pair of equal or unequal liquid quantities 
in identical containers and asked to judge 
them with the tokens. One of the quan- 
tities was then poured into another con- 
tainer with a different shape, and she was 
asked to make a judgment on the novel 
stimuli. Results indicated that she correctly 
judged the quantities in liquid and solid 
but not in number conservation tasks. Addi- 
tional tests also showed that Sarah was un- 
able to judge correctly when she was pre- 
vented from seeing the quantities presented 
in identical containers first. This result led 
Woodruff, Premack, and Kennel (1978) to conclude 
that she based her judgments on logical 
necessity rather than perceptual estimation. 
Similarly, Muncer (1983) reported 
that a chimpanzee was capable of selecting 
the larger of two quantities after applying a 
transformation that changed the appearance 
of the liquid. As in the previous study, the 
chimpanzee was unable to select the larger 
quantity if she was prevented from seeing 
the pretransformation quantities displayed 
in identical containers. 

Call and Rochat (1996) investigated the 
ability of four orangutans to solve liquid 
conservation problems based on Muncer’s 
procedure. Subjects were presented 
with a pair of identical containers 
with unequal amounts of juice. Once sub- 
jects had indicated their choice by pointing 
(which invariably was to the larger quan- 
tity), the experimenter transferred the liquid 
quantities into a pair of unequal containers. 
In different experiments, the authors var- 
ied either the shape of the containers or the 
number of containers available (while keep- 
ing the shape constant). Although some apes 
still selected the larger quantity after a shape 
transformation, this performance deterio- 
rated when the contrast between the shape 
of the containers was increased. In contrast, 
some of the six- to eight-year-old children 
they tested performed satisfactorily. Further- 
more, none of the apes solved the task when 
the quantities were transferred into mul- 
tiple containers. Call and Rochat (1996) 
concluded that orangutans depended upon 
perceptual information rather than logical 
necessity, thereby demonstrating “pseudo- 
conservation.” In a follow-up study, Call and 
Rochat (1997) investigated the use of per- 
ceptual strategies underlying the orangutans’ 
pseudoconservation. The authors examined 
three possible perceptual strategies to iden- 
tify the larger amount of liquid: visual es- 
timation of the liquid in the container, the 
use of information about quantity based 
on pouring the liquid, and a tracking strategy 
that consisted of following the liquid 
that subjects had initially chosen. Results 
indicated that the visual estimation 
strategy best accounted for the orangutans’ 
pseudoconservation. Overall, these investi- 
gators interpreted their results as indicating 
that orangutans are very good at estimating 
quantities and at tracking the quantity they 
prefer across various spatial displacements, 
but they do not conserve quantities across 
perceptual transformations in a humanlike 
manner. 

The studies cited with chimpanzees sug- 
gest the use of logical reasoning, whereas 
studies with orangutans suggest the use of 
perceptual estimation in the solution of liq- 
uid conservation problems. Because both 
the species and methods employed in the 
studies differed, it is difficult to know 
whether chimpanzees and 
orangutans truly differ in the mechanisms 
they use to solve conservation problems or 
the differences were a result of the methods 
used in each set of studies. Recently, Suda 
and Call (in press) set out to resolve this 
discrepancy by studying chimpanzees and 
orangutans with the same procedures. They 
presented apes with various liquid conser- 
vation problems in which the initial quanti- 
ties were transferred into containers of dif- 
ferent shapes or into multiple containers, di- 
viding the total quantity. Results supported 
the notion that most apes relied on perceptual 
estimation rather than logical necessity, 
with orangutans being slightly more proficient 
than chimpanzees. 

In summary, primates can solve quanti- 
tative problems that require combining or 
dissociating quantities, and they can develop 
the notion of ordinality. In contrast, there is 
little evidence that primates use logical ne- 
cessity when confronted with various Piage- 
tian conservation problems. 


Causal Reasoning 


Causal reasoning is a complex topic, and 
much hinges on the chosen definition of 
causality. Some researchers interpret causality 
as the ability to form stimulus-stimulus 
or stimulus-response associations. In this 
broad sense there is no doubt that many an- 
imals are sensitive to causality. We concen- 
trate more narrowly on the understanding of 
the underlying “structures” and “forces” that 
are responsible for certain effects. This has 
been most studied in the domain of tool use, 
but it has also been investigated in a variety 
of other types of physical events in which 
the subject does not manipulate but only 
observes. 


Tool Use 


Many introductory psychology texts mention 
the experiments involving tool use 
by chimpanzees as groundbreaking studies. 
Since then, it has been shown that 
primates use tools in a variety of ways 
and for a variety of purposes 
(see Beck, 1980; Tomasello & Call, 
1997, for reviews). We concentrate on three 
tasks that have been used to investigate 
causality. 


SUPPORT PROBLEM 


In this problem, a reward is placed on a 
cloth. The reward itself is outside the sub- 
ject’s reach, but one of the ends of the cloth 
is within reach. The solution to this problem 
consists of pulling the cloth to bring the re- 
ward within reach. Piaget (1952) studied this 
problem in human infants and indicated that 
by 12 months of age children not only readily 
pull in the cloth but, more importantly, they 
withhold pulling when the reward is not in 
contact with the cloth. This indicates that 
children at this age understand that spatial 
contact is necessary for the tool to act on 
the reward. 

Spinozzi and Poti (1989) tested several in- 
fant primates (one Japanese macaque, two 
capuchin monkeys, two longtail macaques, 
and one gorilla) on this problem. In one con- 
dition, the reward was placed on the cloth, 
whereas in another condition the reward was 
placed off the cloth to the side. All primates 
responded appropriately by pulling in the 
cloth when the reward was on the cloth and 
withheld pulling when the reward was off 
the cloth. In a second experiment, Spinozzi 
and Poti (1989) tested the generality of these 
findings by modifying the off-cloth 
condition, placing the reward 
near the end of the cloth rather than to 
the side of it. The authors reasoned that if 
subjects had simply learned to respond ap- 
propriately to a specific configuration of the 
cloth and the reward rather than a more 
general relation between them, they would 
respond inappropriately to this novel con- 
figuration. Results confirmed their previous 
findings: All subjects pulled in the on-cloth 
condition but not in the off-cloth condition. 
Recently, Spinozzi and Poti (1993) admin- 
istered the same support problem to two 
infant chimpanzees and only one of them 
succeeded. 


gated this problem in detail with cotton-top 
tamarins (Hauser, Kralik & Botto-Mahan, 
1999). Their studies asked whether 
these monkeys can distinguish between rele- 
vant and irrelevant features of tools — in this 
case, the cloth. They found that tamarins 
were able to master this problem. In par- 
ticular, presented with two cloths and two 
rewards from which to choose, they pulled 
the cloth on which the reward rested. In 
another experiment, subjects selected the 
cloth that was connected somehow to the 
reward but avoided the cloth that was not 
connected to the reward. Once subjects had 
mastered these two problems, the authors 
presented monkeys with both relevant and 
irrelevant changes to the problem. Relevant 
changes included the position of the reward 
in relation to the cloth or the connectedness 
between two pieces of cloth; irrelevant 
changes included variations in the 
color, texture, or shape of the cloth. The 
tamarins ignored irrelevant changes to the 
tool such as color or shape. They failed to 
solve some problems involving changes in 
the relevant features, although they mas- 
tered those problems with additional expe- 
rience. The authors interpreted this as an 
ability to distinguish between relevant and 
irrelevant features. 

Hauser et al. (2002) investigated whether 
experience with tool use played a role in de- 
ciding what constituted the relevant func- 
tional features of a tool. They presented 
monkeys with a number of cloth problems 
that varied along several parameters except 
that the correct alternative was always in- 
dicated by the same color. Subjects there- 
fore could solve the various problems by 
either attending to the relevant features of 
the problem (e.g., connectedness) or the 
color of the cloth. Once animals had mas- 
tered the series of preliminary tests, they 
were presented with novel problems but 
with the color contingency reversed so that 
color always signaled the incorrect alterna- 
tive. Results indicated that tool-experienced 
monkeys relied less on irrelevant cues such 
as color than tool-naive individuals in solving 
the cloth problem. Nevertheless, all 
monkeys showed a decrement in performance, 
although the performance of 
tool-experienced monkeys suffered less. 


STICK AND HOOK PROBLEM 


A more challenging task than the sup- 
port problem consists of using a tool to 
bring in a reward that is not in direct con- 
tact with the tool. This situation entails 
putting the tool into contact with the re- 
ward and then sweeping the reward within 
reach. According to Natale (1989), solving 
this task demonstrates an ability to understand 
complex causal relations such as that 
the stick must be of the appropriate size 
and material (e.g., long and rigid) and that 
only certain kinds of contact (e.g., with a 
certain force and directionality) would be 
successful. 

Natale (1989) presented eight subjects 
from the same four species tested by 
Spinozzi and Poti (1989) with an out-of- 
reach reward and a stick placed in differ- 
ent positions relative to the object in dif- 
ferent experimental conditions. Three of 
the four capuchin monkeys and the go- 
rilla were moderately successful in obtaining 
the reward in various tool-reward spatial ar- 
rangements. These results have been con- 
firmed by other studies (see Beck, 1980; 
Tomasello & Call, 1997 for a review). Al- 
though none of the macaques tested by 
Natale (1989) was able to obtain the re- 
ward with the stick, other studies have 
shown that macaques and other primates, 
including baboons, orangutans, and chim- 
panzees, are capable of solving the stick 
problem (see Tomasello & Call, 1997 for a 
review). 

A refinement of the stick problem con- 
sists of presenting a hook-shaped tool and 
a straight tool as alternatives for retrieving 
the reward. Hauser (1997) presented cotton- 
top tamarins with two hooked tools, only 
one of which had the reward inside the 
hook so that pulling it would bring the re- 
ward. Once tamarins consistently solved this 
problem — that is, they preferred the stick 
with the reward inside the hook — the au- 
thors presented novel problems in which 


tures of the task, as previously done with 
the cloth problem. Results mirrored those 
of the cloth problem and indicated that in- 
dividuals selected tools most often based on 
relevant, as opposed to irrelevant, functional 
features. Hauser, Pearson, and Seelig (2002) 
recently investigated the role of experience 
in the ability to distinguish relevant from 
irrelevant features. They found that infant 
tamarins, without much experience with 
tools, also selected tools based on relevant 
features, reproducing the results of the adult 
subjects. 


TUBE AND TRAP PROBLEM 


In this problem, the reward is placed inside 
the middle portion of a transparent tube, 
and subjects have to use a stick to push the 
reward out the end opposite to which the 
reward was inserted. In a series of studies, 
Visalberghi and colleagues explored the abil- 
ity of capuchin monkeys and apes (mainly 
chimpanzees) to solve this problem and to 
adjust to novel variations of this problem. 
Visalberghi and Trinca (1989) found that 
three of four capuchin monkeys succeeded 
in the basic version of the problem, and then 
the authors administered three variations 
of the problem involving different types of 
tools that required different solutions. In the 
bundle task, subjects were given a bundle 
of sticks taped together that, as a whole, 
was too wide to fit in the tube; the solu- 
tion consisted of breaking the sticks apart. 
In the short-sticks task, subjects were given 
three short sticks that, together, added up 
to the length required; the solution consisted 
of putting them all in the same end of 
the tube to displace the food out the other 
side. Finally, in the H-tool task, subjects were 
given a stick with transverse pieces on ei- 
ther end that prevented its insertion into the 
tube; the solution consisted of removing the 
blocking piece from the tool. Although all 
three subjects eventually solved these varia- 
tions of the task, they made a number of er- 
rors such as attempting to insert the whole 
bundle or inserting one short stick in one 
end of the tube and another short stick in 
the other. These errors did not decrease 
significantly over trials, suggesting 
that capuchins understood little about 
the causal relations between the elements in 
the task. Visalberghi, Fragaszy, and Savage- 
Rumbaugh (1995) essentially replicated the 
results of the bundle and the H-tool tasks 
with six other capuchins. 

One recent study, however, suggests that 
some capuchins may understand more about 
causal relations than previously thought. 
Anderson and Henneman (1995) tested the 
ability of two adult capuchin monkeys to an- 
ticipate (and solve) a variety of problems as- 
sociated with using a stick to extract honey 
from a box with multiple holes. In a se- 
ries of experiments of increasing complex- 
ity, subjects were required to select a stick 
of the appropriate diameter to fit the holes, 
rake in a stick of the appropriate diameter 
with the help of another tool, modify a stick 
that was too thick or too twisted to fit the 
holes, or construct a rake that would per- 
mit them to obtain a suitable stick to ex- 
tract the honey. Results indicated that both 
capuchins (especially the male) readily se- 
lected sticks of a diameter suitable to fit the 
holes. This even included cases in which the 
box and the sticks available were not within 
the same visual field. This result contrasts 
with Visalberghi’s (1993) findings in which 
capuchins failed to select appropriate tools 
to solve the tube task when the tools were 
left in a room adjacent to the tube with food 
in it. Moreover, Anderson and Henneman 
(1995) noted that one capuchin modified 
tools in a very purposeful manner without 
committing the sort of errors described by 
Visalberghi and Trinca (1989). The same ca- 
puchin also used a tool (itself not suitable for 
honey-dipping) to rake in appropriate sticks 
for honey-dipping. Neither of the subjects, 
however, was able to construct a rake to ob- 
tain honey-dipping sticks. 

The tube task has also been administered 
to apes. First, Bard, Fragaszy, and Visalberghi 
(1995) administered this task to young chim- 
panzees (two to four years old) and found 
that in the two most difficult versions of the 
task (i.e., short-sticks and H-tool), their 
performance actually deteriorated over trials, 
indicating they 
may not have come to understand the causal 
relations involved, although their young age 
may have explained their poor performance. 
Visalberghi, Fragaszy, and Savage-Rumbaugh 
(1995) presented the bundle and H-tool 
tasks to subadult and adult apes (four bono- 
bos, five chimpanzees, and one orangutan). 
Eight of the ten apes solved the basic tube 
task on the first trial, and the other two 
were successful later. When given a bun- 
dle of sticks, all subjects immediately disas- 
sembled the bundle and, unlike capuchins, 
no ape attempted to insert the bundle as 
a whole. Apes proved less successful in the 
H-tool task, however, making some of the 
same mistakes as the capuchins. Indeed, a 
statistical comparison of the two species in 
this condition revealed no significant differ- 
ence. Although there was an overall group 
tendency to decrease the number of errors 
across trials, some subjects increased their 
errors. 

To examine further the understanding of 
causal relations in the tube task, Visalberghi 
and Limongelli (1994) presented a new tube 
problem that punished subjects who did not 
foresee the consequences of their behavior. 
The authors presented four capuchin mon- 
keys with a tube that had a trap in its bottom 
center, and placed the food next to the trap. 
If subjects pushed the food in the direction 
of the trap, it would fall in it and they would 
lose it; to get the food out, they had to push 
the food away from the trap toward the other 
end of the tube. Visalberghi and Limongelli 
(1994) found that only one subject solved 
the task, systematically pushing the reward 
away from the trap. Although this subject 
seemed to be planning her moves in advance, 
the authors noted that in half the trials, she 
inserted the tool in the wrong side of the 
tube and, upon seeing that the reward was 
moving into the trap, withdrew the tool, 
reinserted it in the other end, and pushed 
out the reward. Visalberghi and Limongelli 
(1994) probed further her understanding of 
the relation between the trap and the reward 
by inverting the trap 180 degrees so that the 
trap was no longer effective. The subject, however, 
persisted in her strategy of pushing the food 
away from the trap, suggesting that she had 
simply learned to push the food 
away from the trap side without understand- 
ing the causal relations between the trap and 
the reward. 

Limongelli, Boysen, and Visalberghi 
(1995) presented the trap-tube task to five 
chimpanzees who behaved at chance levels 
for the first seventy trials, although two of 
them learned to avoid the trap during sev- 
enty additional trials. The authors admin- 
istered an additional test to assess whether 
chimpanzees understood the position of the 
reward in relation to the trap or whether 
they were simply using the rule of pushing 
the reward out the side to which it was closest, 
thus avoiding the trap. Limongelli, Boysen, 
and Visalberghi (1995) varied the location 
of the trap in the tube. In some cases, the 
trap was located very close to one end with 
the food just beyond it, so that subjects ac- 
tually had to push the food out the end from 
which it was farthest. In other cases, the 
opposite arrangement was used. Both sub- 
jects solved these variations easily, with al- 
most no errors, so the researchers concluded 
that these two chimpanzees understood the 
causal relations in this task better than the 
capuchin monkeys. It should be noted, how- 
ever, that the variations used in this experi- 
ment could still be solved by the rule “push 
the food away from the trap,” which could 
have been learned during the previous tri- 
als. Unfortunately, the authors did not in- 
vert the trap as was previously done with 
capuchins. 
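
The argument about which test separates which strategy can be laid out explicitly. The sketch below is a hypothetical model, not a description of the original apparatus: the tube is treated as a line from 0.0 (left end) to 1.0 (right end), the positions are invented for illustration, and the three rules are simplified versions of the strategies discussed above.

```python
from dataclasses import dataclass


@dataclass
class Tube:
    trap: float              # trap position, 0.0 (left end) to 1.0 (right end)
    food: float              # food position on the same scale
    trap_works: bool = True  # False when the trap is inverted


def nearest_exit(tube):
    """Rule 1: push the food out of the end to which it is closest."""
    return "left" if tube.food < 0.5 else "right"


def away_from_trap(tube):
    """Rule 2: always push the food away from the trap, working or not."""
    return "right" if tube.trap < tube.food else "left"


def causal(tube):
    """Rule 3: avoid the trap only when it can actually capture the food."""
    if tube.trap_works:
        return away_from_trap(tube)
    return "either"  # an inverted trap is harmless, so no side is favored


def food_obtained(tube, direction):
    """True if pushing in the given direction gets the food out."""
    if not tube.trap_works or direction == "either":
        return True
    if direction == "left":
        return not (tube.trap < tube.food)   # trap lies on the leftward path
    return not (tube.trap > tube.food)       # trap lies on the rightward path


# Hypothetical layouts standing in for the three test conditions.
CONDITIONS = {
    "central trap": Tube(trap=0.5, food=0.4),
    "trap near left end": Tube(trap=0.2, food=0.3),
    "inverted central trap": Tube(trap=0.5, food=0.4, trap_works=False),
}
RULES = {
    "nearest exit": nearest_exit,
    "away from trap": away_from_trap,
    "causal": causal,
}

for condition, tube in CONDITIONS.items():
    for rule_name, rule in RULES.items():
        direction = rule(tube)
        result = food_obtained(tube, direction)
        print(f"{condition:22} {rule_name:15} push {direction:6} success={result}")
```

In this sketch, moving the trap toward one end separates the nearest-exit rule from the other two, but the away-from-trap rule and the causal rule still make identical choices. Only the inverted trap distinguishes them, because the causal rule then shows no side preference while the away-from-trap rule persists.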

In summary, this section has shown 
that various primates have some knowledge 
about causal relations regarding what makes 
a tool effective. They know that objects have 
to be in contact for a tool to be effective, rec- 
ognize the relevant and irrelevant functional 
features of a tool, and can choose the ap- 
propriate dimensions of an effective tool in 
a particular task. Nevertheless, these studies 
have also shown clear limitations, perhaps 


variations. 


Perceiving and Judging Physical Events 


One area that has received considerable at- 
tention is that of object knowledge in in- 
fants. These studies present subjects with a 
series of events — some that follow the laws 
of physics such as solidity or gravity and 
others that violate those laws. Using look- 
ing measures, numerous studies have found 
that human infants respond selectively to 
the violation of physics laws (Baillargeon, 
1995; Spelke et al., 1995). These authors 
have argued that even at this young age, chil- 
dren show object knowledge. Hauser and 
colleagues have been instrumental in intro- 
ducing this area of research in nonhuman 
primates. They have concentrated on two 
topics: gravity and solidity. 

In the gravity area, Hood et al. (1999) pre- 
sented cotton-top tamarins with three con- 
tainers arranged in a straight line. One of the 
containers was connected to an opaque tube 
through which the experimenter dropped 
food. Subjects consistently searched for the 
food in the container over which the food 
was dropped. They did this regardless of 
whether the tube was connected to the 
container or not. This indicates that mon- 
keys failed to understand that the reward’s 
straight-fall trajectory can be diverted by the 
tube. This bias persisted despite variations 
on the incentives offered to the subjects for 
successful performance. Children presented 
with the same task also show a gravity bias, 
although older children can eventually overcome 
it (Hood, Carey, & Prasada, 2000). In a 
follow-up experiment, Hauser et al. (2001) 
reported that when the reward trajectory 
was horizontal rather than vertical (as in the 
original test), tamarins performed better and 
the biases observed previously disappeared. 
Also, subjects with experience with the 
horizontal version of the task performed better 
in the original task (i.e., free-fall reward) 
than subjects without such experience, even 
though the gravity bias was still apparent. 
Taken together, these results suggest that 
tamarins have a gravity bias that 
impairs their search for hidden objects. 

Hauser (2001) investigated the same 
topic with a different paradigm. He pre- 
sented rhesus macaques with a table and two 
boxes. The first box was placed on top of the 
table and the second box was placed under 
the table right under the first box. The exper- 
imenter then raised a screen that occluded 
both boxes (and the table) and dropped a re- 
ward over the top box. Because of the screen, 
the monkeys never saw where the reward 
entered the box; they just saw it falling toward 
the table and disappearing behind the 
screen. Monkeys searched for the food in 
the bottom box, thus showing a gravity bias. 
Control tests indicated that subjects did not 
have a preference for the bottom box, nor 
did they avoid the top box in the absence 
of the reward drop. Interestingly, Santos and 
Hauser (2002) found that rhesus monkeys 
tested with the same paradigm but using a 
violation-of-expectation measure solved this 
problem. In other words, subjects looked 
longer in trials in which the reward appeared 
on the bottom (apparently going through a 
solid partition) than in trials in which the 
reward stayed on top of the partition. 

Call (2004) recently investigated two 
other aspects of the object knowledge that 
subjects may use to find food. The first is 
whether apes know that food inside a con- 
tainer when shaken makes a noise. He found 
that apes are capable of using the noise 
made by shaking food to identify the correct 
container (see inferential reasoning section). 
Although one may argue that this simply 
involves detecting an association between 
the food and the cue rather than an un- 
derstanding that the food causes the noise, 
there are several lines of evidence that sug- 
gest that this interpretation oversimplifies 
the phenomenon. First, subjects performed 
well from the beginning, with no evidence 
of gradual improvement over trials. If sub- 
jects had learned to associate a noise with 
food in the past, it is unclear why, in control 
tests, they failed to associate a noise made by 
tapping the baited cup, which was compara- 
ble to that made by shaking the food inside 
the cup, with the presence of food. This fail- 


were tested after they had solved the initial 
problem. Second, their performance in this 
tapping test was comparable to performance 
in learning novel stimuli with arbitrary rela- 
tions — for instance, learning that a green cup 
has food and a yellow cup does not. Sub- 
jects responded correctly to the auditory cue 
when it held a causal connection to the food 
but failed to do so when the auditory cue 
held a noncausal connection to the food. 

In a second study, Call (unpublished data) 
investigated the ability of apes to use the 
shape of objects to locate food. In the initial 
problem, he presented two rectangular trays 
on a platform and hid a piece of food un- 
der one of them. One of the trays therefore 
rested flat on the platform whereas the other 
rested in an inclined orientation (because of 
the food placed under it). Subjects preferentially 
selected the inclined tray but failed to 
do so in a control test in which the inclined 
tray was substituted by a wooden wedge 
that produced the same visual effect as the 
inclined tray. This result was important be- 
cause it ruled out the possibility that subjects 
simply preferred the perceptual appearance 
of the inclined tray, perhaps because it had 
been reinforced in the past. More impor- 
tantly, subjects failed to select the wedge, 
an arbitrary stimulus, after they were suc- 
cessful in the inclined tray test. This result 
is analogous to that of the previous study 
in which subjects failed to respond above 
chance to stimuli with noncausal connec- 
tions to the food after they had succeeded 
with very similar stimuli with causal connec- 
tions. Once again, when the relations between 
the food and the elements of the problem are 
arbitrary (i.e., noncausal), subjects perform 
poorly compared with when the connection is 
nonarbitrary (i.e., causal). It is unlikely that 
these results 
are based solely on learning to associate a 
cue with a response without any insight into 
the structure of the problem. Instead, it is 
conceivable that subjects understood that it 
was the food that caused the noise or made 
the tray incline, not simply that the food was 
associated with the presence of the noise or 
the shape. 
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edge about the physical properties of 
objects, and they can use this knowledge to 
predict the location in which rewards can be 
found. 


Social Reasoning 


Primates’ social cognition represents a large 
area of research in its own right. At the 
most basic level, it involves how individu- 
als understand and predict the behavior and, 
perhaps, the perceptual activities of others. 
Of course, it also may involve how indi- 
viduals understand the psychological states 
and activities of others, which are less di- 
rectly observable. So the question is whether 
primates can reason about the psycholog- 
ical states and activities of others. Despite 
many richly interpreted anecdotes, until re- 
cently there was very little evidence that 
primates reasoned about what others were 
seeing, intending, wanting, and thinking (see 
Tomasello & Call, 1997, for a review). Some 
recent studies have demonstrated that pri- 
mates can reason about some — although 
clearly not all — of the psychological states 
about which humans reason. 

Hare et al. (2000) placed a subordinate 
and a dominant chimpanzee into rooms on 
opposite sides of a third room. Each had a 
guillotine door leading into the third room 
which, when cracked at the bottom, allowed 
them to observe two pieces of food at vari- 
ous locations within that room — and to see 
the other individual looking under her door. 
After the food had been placed, the doors 
for both individuals were opened and they 
were allowed to enter the third room. The 
basic problem for the subordinate in this sit- 
uation is that the dominant will take all of 
the food it can see. In some cases, however, 
things were arranged so that the subordinate 
could see a piece of food that the dominant 
could not see — because it was on her side of 
a small barrier. The question in these cases 
was whether the subordinate knew that the 
dominant could not see a particular piece of 
food, so it was safe for her to go for it. 


Subordinates did go for food only they could see much 
more often than they went for food that both 
they and the dominant could see. In some 
cases, the subordinate may have been mon- 
itoring the behavior of the dominant, but in 
other cases this possibility was ruled out by 
giving subordinates a small headstart, forcing 
them to make their choice (to go to the food 
that both competitors could see, or to go to 
the food that only they could see) before the 
dominant was released into the area. More- 
over, we ran two other control conditions. In 
one, the dominant’s door was lowered before 
the two competitors were let into the room 
(and again the subordinate got a small head- 
start), so that the subordinate could not see 
which piece the dominant was looking at un- 
der the door (i.e., it is possible that in the first 
studies the subordinate saw that the domi- 
nant was looking at the out-in-the-open food 
and so went for the other piece). The re- 
sults were clear. Subordinates preferentially 
targeted the hidden piece. In the other con- 
trol study, we followed the same basic pro- 
cedure as before (one piece of food in the 
open, one on the subordinate’s side of a bar- 
rier) but we used a transparent barrier that 
did not prevent the dominant from seeing 
the food behind it. In this case, chimpanzees 
chose equally between the two pieces of 
food, seeming to know that the transparent 
barrier was not serving to block the dom- 
inant’s visual access (and so her “control” 
of the food). The findings of these studies 
suggest that chimpanzees know what con- 
specifics can and cannot see and, further, that 
they use this knowledge to make inferences 
about what their competitor is about to do 
next. 

In a follow-up study, Hare, Call, and 
Tomasello (2001) investigated whether 
chimpanzees were also able to take into ac- 
count past information such as whether the 
dominant had seen the baiting. For these 
experiments, two barriers and one piece of 
food were used, and what the dominant 
saw was manipulated. In experimental tri- 
als, dominants had not seen the food hid- 
den, or food they had seen hidden was 
moved to a different location when they 
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they saw the food being hidden or moved. 
Subordinates, on the other hand, always 
saw the entire baiting procedure and could 
monitor the visual access of the dominant 
competitor. Subordinates preferentially re- 
trieved and approached the food that domi- 
nants had not seen hidden or moved, which 
suggests that subordinates were sensitive to 
what dominants had or had not seen during 
baiting a few moments before. In this case, 
deciding which piece of food to approach 
depended on the subordinate’s making infer- 
ences about what the dominant knew about 
the situation. 

These studies of what may be called so- 
cial problem solving demonstrate that some 
primates may make inferential leaps not 
just about directly perceivable things, but 
also about less observable things such as 
what others do and do not see, or even 
have or have not seen in the immediate 
past. 


Conclusions and Future Directions 


There was a time when the dominant view 
in the Western intellectual tradition was that 
human beings were rational and all other an- 
imals were simply preprogrammed brutes or 
automata. That view is demonstrably false. 
All the evidence reviewed in this chapter 
suggests that nonhuman primates interact 
with their worlds in many creative ways, re- 
lying on a variety of cognitive processes to do 
so. They reason and make inferences about 
space, causality, objects, quantities, and the 
psychological states of other individuals, and 
in some cases they can engage in relational 
and analogical reasoning concerning partic- 
ular objects or even categories of objects. 
The main pitfall to avoid in attempting to 
integrate our knowledge about the cognitive 
skills of other animals with our knowledge 
about human cognition is oversimplification. 
Asking dichotomously whether or not ani- 
mals reason or think or have a theory of mind 
generally is not very useful (Tomasello, Call, 
& Hare, 2003a, 2003b). Nonhuman animals 


aspects, of the human version, and in some 
cases they possess skills that humans do not 
have or do not have to the same degree (e.g., 
some of the memory skills demonstrated by 
food-caching birds; Shettleworth, 1998). We 
need to compare the skills in detail if we 
want to provide an anatomy of their struc- 
ture from an evolutionary point of view. 

Because this book is mainly about human 
reasoning and thinking, we should conclude 
with a word about what we believe makes 
human cognition different from that of other 
primates. The answer, in a word, is culture 
(Tomasello & Call, 1997; Tomasello, 1999). 
The thought experiment we use to demon- 
strate the point is to imagine a human child 
raised on a desert island without any social 
contacts. Our contention is that in adult- 
hood this adult’s cognitive skills would not 
differ very much — perhaps a little, but not 
very much — from those of other great apes. 
This person would certainly not invent by 
him or herself a natural language, or algebra 
or calculus, or science or government. The 
human cognitive skills that make the most 
difference are those that enable individu- 
als of the species Homo sapiens, in a sense, 
to pool their cognitive resources — to create 
and participate in collective cultural activ- 
ities and products. When viewed from the 
perspective of the individual mind, the cog- 
nitive skills necessary for cultural creation 
and learning may not differ so very much 
from those of other primate species. 

In any case, much can be learned about 
human cognition by looking at how it is 
similar to and how it is different from that 
of closely related species. We hope to have 
shown in this chapter that, in many fun- 
damental respects, human cognition is sim- 
ply one form of primate cognition. The vast 
gulf that seems to separate what humans and 
other primates can do cognitively — in the do- 
main of mathematics, as just one instance — 
in many, if not most, cases is the result of 
fairly small differences of individual psy- 
chology that enable humans to accumulate 
knowledge across generations and to use col- 
lective artifacts such as linguistic and math- 
ematical symbols. 


Acknowledgments 


The authors wish to thank Keith Holyoak 
for comments made on an earlier draft of 
this chapter. 


References 


Anderson, J. R. & Henneman, M. C. (1995). So- 
lutions to a tool-use problem in a pair of Cebus 
apella. Mammalia, 58, 351-361. 

Baillargeon, R. (1995). Physical reasoning in in- 
fancy. In M. Gazzaniga (Ed.), The Cognitive 
Neurosciences (pp. 181-204). Cambridge, MA: 
MIT Press. 

Bard, K. A., Fragaszy, D. M., & Visalberghi, 
E. (1995). Acquisition and comprehension of 
a tool-using behavior by young chimpanzees 
(Pan troglodytes): Effects of age and modelling. 
International Journal of Comparative Psychol- 
ogy, 8, 47-68. 

Beck, B. B. (1980). Animal Tool Behavior. New 
York: Garland Press. 

Beran, M. J. (2001). Summation and numerous- 
ness judgments of sequentially presented sets of 
items by chimpanzees (Pan troglodytes). Journal 
of Comparative Psychology, 115, 181-191. 

Beran, M. J., & Minahan, M. F. (2000). Moni- 
toring spatial transpositions by bonobos (Pan 
paniscus) and chimpanzees (P. troglodytes). In- 
ternational Journal of Comparative Psychology, 
13, 1-15. 

Beran, M. J., & Rumbaugh, D. M. (2001). “Con- 
structive” enumeration by chimpanzees (Pan 
troglodytes) on a computerized task. Animal 
Cognition, 4, 81-89. 

Beran, M. J., & Washburn, D. A. (2002). Chim- 
panzee responding during matching to sample: 
Control by exclusion. Journal of the Experimen- 
tal Analysis of Behavior, 78, 497-508. 

Bernstein, I. S. (1961). The utilization of visual 
cues in dimension-abstracted oddity by pri- 
mates. Journal of Comparative and Physiological 
Psychology, 54, 243-247. 

Boesch, C., & Boesch, H. (1984). Mental map 
in wild chimpanzees: An analysis of hammer 
transports for nut cracking. Primates, 25, 160- 
170. 

Bond, A. B., Kamil, A. C., & Balda, R. P. (2003). 
Social complexity and transitive inference in 
Corvids. Animal Behaviour, 65, 479-487. 

Bovet, D., & Vauclair, J. (2001). Judgment of con- 
ceptual identity in monkeys. Psychonomic Bul- 
letin and Review, 8, 470-475. 

Boyd, R., & Richerson, P. (1985). Culture and the 
Evolutionary Process. Chicago: The University 
of Chicago Press. 

Boysen, S. T., & Berntson, G. G. (1989). Nu- 
merical competence in a chimpanzee (Pan 
troglodytes). Journal of Comparative Psychology, 
103, 23-31. 

Boysen, S. T., Berntson, G. G., Shreyer, T. A., & 
Hannan, M. B. (1995). Indicating acts during 
counting by a chimpanzee (Pan troglodytes). 
Journal of Comparative Psychology, 109, 
47-51. 

Boysen, S. T., Berntson, G. G., Shreyer, T. A., 
& Quigley, K. S. (1993). Processing of ordi- 
nality and transitivity by chimpanzees (Pan 
troglodytes). Journal of Comparative Psychology, 
107, 208-215. 

Brannon, E. M., & Terrace, H. S. (1998). Ordering 
of the numerosities 1 to 9 by monkeys. Science, 
282, 746-749. 

Brannon, E. M., & Terrace, H. S. (2000). Rep- 
resentation of the numerosities 1-9 by Rhesus 
macaques (Macaca mulatta). Journal of Exper- 
imental Psychology: Animal Behavior Processes, 
26, 31-49. 

Burdyn, L. E., & Thomas, R. K. (1984). Condi- 
tional discrimination with conceptual simulta- 
neous and successive cues in the squirrel mon- 
key (Saimiri sciureus). Journal of Comparative 
Psychology, 98, 405-413. 

Call, J. (2000). Estimating and operating on 
discrete quantities in orangutans (Pongo pyg- 
maeus). Journal of Comparative Psychology, 
114, 136-147. 

Call, J. (2001). Object permanence in orangu- 
tans (Pongo pygmaeus), chimpanzees (Pan 
troglodytes), and children (Homo sapiens). Jour- 
nal of Comparative Psychology, 115, 159- 
171. 

Call, J. (2003). Spatial rotations and transposi- 
tions in orangutans (Pongo pygmaeus) and chim- 
panzees (Pan troglodytes). Primates, 44, 347- 
357. 

Call, J. (2004). Inferences about the location of 
food in the great apes. Journal of Comparative 
Psychology, 118, 232-241. 

Call, J., & Rochat, P. (1996). Liquid conservation 
in orangutans. Individual differences and per- 
ceptual strategies. Journal of Comparative Psy- 
chology, 110, 219-232. 


Call, J., & Rochat, P. (1997). Perceptual strate- 
gies in the estimation of physical quantities by 
orangutans (Pongo pygmaeus). Journal of Com- 
parative Psychology, 111, 315-329. 

Cramer, A. E., & Gallistel, C. R. (1997). Vervet 
monkeys as travelling salesmen. Nature, 387, 
464. 

D’Amato, M. R., & Colombo, M. (1985). Au- 
ditory matching-to-sample in monkeys (Ce- 
bus apella). Animal Learning and Behavior, 13, 
375-382. 

D’Amato, M. R., & Colombo, M. (1988). Rep- 
resentation of serial order in monkeys (Cebus 
apella). Journal of Experimental Psychology: An- 
imal Behavior Processes, 14, 131-139. 

D’Amato, M. R., & Colombo, M. (1989). Serial 
learning with wild card items by monkeys (Ce- 
bus apella): Implications for knowledge of ordi- 
nal position. Journal of Comparative Psychology, 
103, 252-261. 

D’Amato, M. R., & Salmon, D. P. (1984). Cog- 
nitive processes in cebus monkeys. In H. L. 
Roitblat, T. G. Bever, & H. S. Terrace (Eds.), 
Animal Cognition (pp. 149-168). Hillsdale, NJ: 
Erlbaum. 

D’Amato, M. R., Salmon, D. P., Loukas, E., & 
Tomie, A. (1985). Symmetry and transitiv- 
ity of conditional relations in monkeys (Cebus 
apella) and pigeons (Columba livia). Journal of 
the Experimental Analysis of Behavior, 44, 35- 
47. 

D’Amato, M. R., Salmon, D. P., Loukas, E., & 
Tomie, A. (1986). Processing of identity and 
conditional relations in monkeys (Cebus apella) 
and pigeons (Columba livia). Animal Learning 
and Behavior, 14, 365-373. 

De Blois, S. T., & Novak, M. A. (1994). Object 
permanence in rhesus monkeys (Macaca mu- 
latta). Journal of Comparative Psychology, 108, 
318-327. 

De Blois, S. T., Novak, M. A., & Bond, M. (1998). 
Object permanence in orangutans (Pongo pyg- 
maeus) and squirrel monkeys (Saimiri sci- 
ureus). Journal of Comparative Psychology, 112, 
137-152. 

DeLoache, J. S. (1995). Early understanding and 
use of symbols: The model model. Current Di- 
rections in Psychological Science, 4, 109-113. 

Dumas, C. & Brunet, C. (1994). Permanence de 
l’objet chez le singe capucin (Cebus apella): 
Étude des déplacements invisibles. Revue 
Canadienne de psychologie expérimentale, 48, 
341-357. 

Fagot, J., Wasserman, E. A. & Young, M. E. 
(2001). Discriminating the relation between 
relations: The role of entropy in abstract con- 
ceptualization by baboons (Papio papio) and 
humans (Homo sapiens). Journal of Experimen- 
tal Psychology: Animal Behavior Processes, 27, 
316-328. 

Fujita, K. (1982). An analysis of stimulus con- 
trol in two-color matching-to-sample behav- 
iors of Japanese monkeys (Macaca fuscata fus- 
cata). Japanese Psychological Research, 24, 124- 
135. 

Fujita, K. (1983). Formation of the sameness- 
difference concept by Japanese monkeys from 
a small number of color stimuli. Journal of 
the Experimental Analysis of Behavior, 40, 289- 
300. 


Garber, P. (1989). Role of spatial memory in pri- 
mate foraging patterns: Saguinus mystax and 
Saguinus fuscicollis. American Journal of Prima- 
tology, 19, 203-216. 


Gillan, D. J. (1981). Reasoning in the chimpanzee: 
II. Transitive inference. Journal of Experimental 
Psychology: Animal Behavior Processes, 7, 150- 
164. 

Gillan, D. J., Premack, D., & Woodruff, G. (1981). 
Reasoning in the chimpanzee: I. Analogical rea- 
soning. Journal of Experimental Psychology: An- 
imal Behavior Processes, 7, 1-17. 


Hare, B., Call, J., Agnetta, B., & Tomasello, M. 
(2000). Chimpanzees know what conspecifics 
do and do not see. Animal Behaviour, 59, 771- 
785. 

Hare, B., Call, J., & Tomasello, M. (2001). Do 
chimpanzees know what conspecifics know 
and do not know? Animal Behaviour, 61, 139- 
151. 

Harlow, H. F. (1959). The development of learn- 
ing in the rhesus monkey. American Scientist, 
47, 459-479. 

Hashiya, K., & Kojima, S. (2001). Hearing and 
auditory-visual intermodal recognition in the 
chimpanzee. In T. Matsuzawa (Ed.), Primate 
Origins of Human Cognition and Behavior 
(pp. 155-189). Berlin: Springer-Verlag. 

Hauser, M. D., Kralik, J., & Botto-Mahan, C. 
(1999). Problem solving and functional design 
features: Experiments on cotton-top tamarins, 
Saguinus oedipus. Animal Behaviour, 57, 565- 
582. 

Hauser, M., Pearson, H., & Seelig, D. (2002). 
Ontogeny of tool use in cotton-top tamarins, 
Saguinus oedipus: Innate recognition of 
functionally relevant features. Animal Be- 
haviour, 64, 299-311. 

Hauser, M. D. (1997). Artifactual kinds and func- 
tional design features: What a primate under- 
stands without language. Cognition, 64, 285- 
308. 

Hauser, M. D. (2001). Searching for food in 
the wild: A nonhuman primate’s expectations 
about invisible displacement. Developmental 
Science, 4, 84-93. 

Hauser, M. D., Santos, L. R., Spaepen, G. M., & 
Pearson, H. E. (2002). Problem solving, inhi- 
bition and domain-specific experience: Exper- 
iments on cotton-top tamarins, Saguinus oedi- 
pus. Animal Behaviour, 64, 387-396. 

Hauser, M. D., Williams, T., Kralik, J. D., & 
Moskovitz, D. (2001). What guides a search 
for food that has disappeared? Experiments 
on cotton-top tamarins (Saguinus oedipus). 
Journal of Comparative Psychology, 115, 140- 
151. 

Hood, B., Carey, S. & Prasada, S. (2000). Pre- 
dicting the outcomes of physical events: Two- 
year-olds fail to reveal knowledge of solidity 
and support. Child Development, 71, 1540- 
1554. 

Hood, B. M., Hauser, M. D., Anderson, L., & 
Santos, L. (1999). Gravity biases in a non- 
human primate? Developmental Science, 2, 35- 
41. 

Iversen, I. H., & Matsuzawa, T. (2001). Ac- 
quisition of navigation by chimpanzees (Pan 
troglodytes) in an automated fingermaze task. 
Animal Cognition, 4, 179-192. 

Jackson, W. J., & Pegram, G. V. (1970a). Ac- 
quisition, transfer and retention of matching 
by Rhesus monkeys. Psychological Reports, 27, 
839-846. 

Jackson, W. J., & Pegram, G. V. (1970b). Com- 
parison of intra- vs. extradimensional transfer 
of matching by Rhesus monkeys. Psychonomic 
Science, 19, 162-163. 

Kawai, N., & Matsuzawa, T. (2000). Numerical 
memory span in a chimpanzee. Nature, 403, 
39-40. 

King, J. E., & Fobes, J. L. (1975). Hypothesis anal- 
ysis of sameness-difference learning-set by ca- 
puchin monkeys. Learning and Motivation, 6, 
101-113. 

King, J. E., & Fobes, J. L. (1982). Complex learn- 
ing by primates. In J. L. Fobes, & J. E. King 
(Eds.), Primate Behavior (pp. 327-360). New 
York: Academic Press. 

King, J. E. (1973). Learning and generalization 
of a two-dimensional sameness-difference con- 
cept by chimpanzees and orangutans. Journal of 
Comparative and Physiological Psychology, 84, 
140-148. 

Köhler, W. (1925). The Mentality of Apes. London: 
Routledge & Kegan Paul. 

Kojima, T. (1979). Discriminative stimulus con- 
text in matching-to-sample of Japanese mon- 
keys. Japanese Psychological Research, 21, 189- 
194. 

Kuhlmeier, V. A., Boysen, S. T., & Mukobi, K. L. 
(1999). Scale-model comprehension by chim- 
panzees (Pan troglodytes). Journal of Compara- 
tive Psychology, 113, 396-402. 

Kuhlmeier, V. A., & Boysen, S. T. (2001). 
The effect of response contingencies on scale 
model task performance by chimpanzees (Pan 
troglodytes). Journal of Comparative Psychology, 
115, 300-306. 

Limongelli, L., Boysen, S. T., & Visalberghi, E. 
(1995). Comprehension of cause-effect rela- 
tions in a tool-using task by chimpanzees (Pan 
troglodytes). Journal of Comparative Psychology, 
109, 18-26. 

Lipsett, L. P., & Serunian, S. A. (1963). Oddity- 
problem learning in young children. Child De- 
velopment, 34, 201-206. 

MacDonald, S. E., Pang, J. C., & Gibeault, S. 
(1994). Marmoset (Callithrix jacchus jacchus) 
spatial memory in a foraging task: Win-stay ver- 
sus win-shift strategies. Journal of Comparative 
Psychology, 108, 328-334. 

MacDonald, S. E., & Wilkie, D. M. (1990). 
Yellow-nosed monkeys’ (Cercopithecus asca- 
nius whitesidei) spatial memory in a simulated 
foraging environment. Journal of Comparative 
Psychology, 104, 382-387. 

McGonigle, B. O., & Chalmers, M. (1977). Are 
monkeys logical? Nature, 267, 694-696. 

Menzel, E. W. Jr. (1973). Chimpanzee spa- 
tial memory organization. Science, 182, 943- 
945. 

Muncer, S. J. (1983). “Conservations” with a 
chimpanzee. Developmental Psychobiology, 16, 
1-11. 

Natale, F. (1989). Causality II: The stick problem. 
In F. Antinucci (Ed.), Cognitive Structure and 
Development in Nonhuman Primates (pp. 121- 
133). Hillsdale, NJ: Erlbaum. 

Natale, F., Antinucci, F., Spinozzi, G., & Poti, 
P. (1986). Stage 6 object concept in nonhu- 
man primate cognition: A comparison between 
gorilla (Gorilla gorilla gorilla) and Japanese 
macaque (Macaca fuscata). Journal of Compar- 
ative Psychology, 100, 335-339. 

Nissen, H. W., Blum, J. S., & Blum, R. A. (1948). 
Analysis of matching behavior in chimpanzee. 
Journal of Comparative and Physiological Psy- 
chology, 41, 62-74. 

Oden, D. L., Thompson, R. K. R., & Premack, 
D. (1988). Spontaneous transfer of matching 
by infant chimpanzees (Pan troglodytes). Jour- 
nal of Experimental Psychology: Animal Behavior 
Processes, 14, 140-145. 

Oden, D. L., Thompson, R. K. R., & Premack, D. 
(1990). Infant chimpanzees spontaneously per- 
ceive both concrete and abstract same/different 
relations. Child Development, 61, 621- 
631. 

Olthof, A., Iden, C. M., & Roberts, W. A. 
(1997). Judgments of ordinality and summa- 
tion of number symbols by squirrel monkeys 
(Saimiri sciureus). Journal of Experimental Psy- 
chology: Animal Behavior Processes, 23, 325- 
339. 

Pepperberg, I. M. (1999). The Alex studies. 
Cognitive and communicative abilities of grey 
parrots. Cambridge, MA: Harvard University 
Press. 

Perusse, R., & Rumbaugh, D. M. (1990). Summa- 
tion in chimpanzees (Pan troglodytes): Effects 
of amounts, number of wells, and finer ratios. 
International Journal of Primatology, 11, 425- 
437. 

Piaget, J. (1952). The Origins of Intelligence in Chil- 
dren. New York: Norton. 

Piaget, J., & Inhelder, B. (1941). Le développement 
des quantités physiques chez l’enfant. Neuchatel, 
Switzerland: Delachaux; and Paris: Niestlé. 

Premack, D., & Premack, A. J. (1994). Levels of 
causal understanding in chimpanzees and chil- 
dren. Cognition, 50, 347-362. 

Premack, D. (1976). Intelligence in Ape and Man. 
Hillsdale, NJ: Erlbaum. 

Premack, D. (1983). The codes of man and 
beasts. Behavioral and Brain Sciences, 6, 125- 
167. 

Roberts, W. A. (1998). Principles of Animal Cog- 
nition. Boston: McGraw-Hill. 

Robinson, J. S. (1955). The sameness-difference 
discrimination problem in chimpanzee. Jour- 
nal of Comparative and Physiological Psychology, 
48, 195-197. 

Robinson, J. S. (1960). The conceptual basis of the 
chimpanzee’s performance on the sameness- 
difference discrimination problem. Journal of 
Comparative and Physiological Psychology, 53, 
368-370. 

Rumbaugh, D. M., & McCormack. (1967). The 
learning skills of primates: A comparative study 
of apes and monkeys. In D. Starck, R. Schnei- 
der, & H. J. Kuhn (Eds.), Progress in Prima- 
tology (pp. 289-306). Stuttgart: Gustav Fischer 
Verlag. 


Rumbaugh, D. M. (1970). Learning skills of an- 
thropoids. In L. A. Rosenblum (Ed.), Primate 
Behavior: Developments in Field and Labora- 
tory Research (pp. 1-70). New York: Academic 
Press. 


Rumbaugh, D. M., Hopkins, W. D., Washburn, 
D. A., & Savage-Rumbaugh, E. S. (1989). Lana 
chimpanzee learns to count by “Numath”: A 
summary of videotaped experimental report. 
Psychological Record, 39, 459-470. 

Rumbaugh, D. M., Savage-Rumbaugh, E. S., & 
Hegel, M. T. (1987). Summation in the chim- 
panzee (Pan troglodytes). Journal of Experimen- 
tal Psychology: Animal Behavior Processes, 13, 
107-115. 

Rumbaugh, D. M., Savage-Rumbaugh, E. S., & 
Pate, J. L. (1988). Addendum to “Summation 
in the chimpanzee (Pan troglodytes)”. Journal of 
Experimental Psychology: Animal Behavior Pro- 
cesses, 14, 118-120. 


Santos, L. R., & Hauser, M. D. (2002). A non- 
human primate’s understanding of solidity: 
Dissociations between seeing and acting. De- 
velopmental Science, 5, 1-7. 

Schino, G., Spinozzi, G., & Berlinguer, L. (1990). 
Object concept and mental representation in 
Cebus apella and Macaca fascicularis. Primates, 
31, 537-544. 

Shettleworth, S. J. (1998). Cognition, Evolution, 
and Behavior. New York: Oxford University 
Press. 


Sigg, H. (1986). Ranging patterns in hamadryas 
baboons: Evidence for a mental map. In J. G. 
Else, & P. C. Lee (Eds.), Primate Ontogeny, Cog- 
nition and Social Behaviour (pp. 87-91). Cam- 
bridge, UK: Cambridge University Press. 

Smith, H. J., King, J. E., Witt, E. D., & Rickel, J. E. 
(1975). Sameness-difference matching from 
sample by chimpanzees. Bulletin of the Psycho- 
nomic Society, 6, 469-471. 

Spelke, E. S., Phillips, A., & Woodward, A. L. 
(1995). Infants’ knowledge of object motion 
and human action. In D. Sperber, D. Premack 
& A. J. Premack (Eds.), Causal Cognition: A 
Multidisciplinary Debate. New York: Oxford 
University Press. 

Spinozzi, G. & Poti, P. (1989). Causality I: The 
support problem. In F. Antinucci (Ed.), Cog- 
nitive Structure and Development in Nonhu- 
man Primates (pp. 113-119). Hillsdale, NJ: 
Erlbaum. 

Spinozzi, G. & Poti, P. (1993). Piagetian Stage 5 in 
two infant chimpanzees (Pan troglodytes): The 
development of permanence of objects and the 
spatialization of causality. International Journal 
of Primatology, 14, 905-917. 

Suda, C. & Call, J. (in press). Piagetian liquid con- 
servation in the great apes. Journal of Compar- 
ative Psychology. 

Sulkowski, G. M., & Hauser, M. D. (2001). Can 
Rhesus monkeys spontaneously subtract? Cog- 
nition, 79, 239-262. 

Swartz, K., Chen, S., & Terrace, H. (1991). Serial 
learning by rhesus monkeys I: Acquisition and 
retention of multiple four item lists. Journal of 
Experimental Psychology: Animal Behavior Pro- 
cesses, 17, 396-410. 

Thomas, R. K., & Boyd, M. G. (1973). A com- 
parison of Cebus albifrons and Saimiri sciureus 
on oddity performance. Animal Learning and 
Behavior, 1, 151-153. 

Thomas, R. K., & Frost, T. (1983). Oddity and 
dimension-abstracted oddity (DAO) in squirrel 
monkeys. American Journal of Psychology, 96, 
51-64. 

Thomas, R. K., & Peay, L. (1976). Length 
judgments by squirrel monkeys: Evidence for 
conservation? Developmental Psychology, 12, 
349-352. 

Thompson, R. K. R., & Oden, D. L. (2000). Cat- 
egorical perception and conceptual judgments 
by nonhuman primates: The paleological mon- 
key and the analogical ape. Cognitive Science, 
24, 363-396. 

Thompson, R. K. R., Oden, D. L., & Boysen, 
S. T. (1997). Language-naive chimpanzees (Pan 
troglodytes) judge relations between relations 
in a conceptual matching-to-sample task. Jour- 
nal of Experimental Psychology: Animal Behavior 
Processes, 23, 31-43. 

Tomasello, M. (1999). The cultural origins of hu- 
man cognition. Cambridge, MA: Harvard Uni- 
versity Press. 

Tomasello, M., & Call, J. (1997). Primate Cogni- 
tion. New York: Oxford University Press. 

Tomasello, M., Call, J. & Hare, B. (2003a). Chim- 
panzees understand psychological states – the 
question is which ones and to what extent. 
Trends in Cognitive Sciences, 7, 153-156. 

Tomasello, M., Call, J. & Hare, B. (2003b). Chim- 
panzees versus humans: It’s not that simple. 
Trends in Cognitive Sciences, 7, 239-240. 


Vauclair, J. (1996). Animal Cognition. An In- 
troduction to Modern Comparative Cognition. 
Cambridge, MA: Harvard University Press. 


Visalberghi, E. (1993). Tool use in a South 
American monkey species: An overview of the 
characteristics and limits of tool use in Cebus 
apella. In A. Berthelet, & J. Chavaillon (Eds.), 
The Use of Tools by Human and Non-Human Pri- 
mates (pp. 118-131). New York: Oxford Uni- 
versity Press. 


Visalberghi, E., Fragaszy, D. M., & Savage- 
Rumbaugh, E. S. (1995). Performance in 
a tool-using task by common chimpanzees 
(Pan troglodytes), bonobos (Pan paniscus), an 
orangutan (Pongo pygmaeus), and capuchin 
monkeys (Cebus apella). Journal of Compara- 
tive Psychology, 109, 52-60. 

Visalberghi, E., & Limongelli, L. (1994). Lack 
of comprehension of cause-effect relations in 
tool-using capuchin monkeys (Cebus apella). 
Journal of Comparative Psychology, 108, 15- 
22. 


Visalberghi, E., & Trinca, L. (1989). Tool use 
in capuchin monkeys: Distinguishing between 
performing and understanding. Primates, 30, 
511-521. 

Vonk, J. (2003). Gorilla (Gorilla gorilla gorilla) 
and orangutan (Pongo abelii) understanding of 
first- and second-order relations. Animal Cog- 
nition, 6, 77-86. 

Washburn, D. A. (1992). Human factors with 
nonhumans: Factors that affect computer-task 
performance. International Journal of Compar- 
ative Psychology, 5, 191-204. 

Washburn, D. A., & Rumbaugh, D. M. (1991). 
Ordinal judgements of numerical symbols by 
macaques (Macaca mulatta). Psychological Sci- 
ence, 2, 190-193. 

Washburn, D. A., & Rumbaugh, D. M. 
(1992). Comparative assessment of psychomo- 
tor performance: Target prediction by humans 
and macaques (Macaca mulatta). Journal of 
Experimental Psychology: General, 121, 305- 
312. 

Weinstein, B. (1941). Matching-from-sample by 
Rhesus monkeys and by children. Journal of 
Comparative and Physiological Psychology, 31, 
195-213. 




Woodruff, G., Premack, D., & Kennel, K. 
(1978). Conservation of liquid and solid quantity 
by the chimpanzee. Science, 202, 991-994. 

Wright, A. A., Santiago, H. C., & Sands, S. F. 
(1984). Monkey memory: Same/different concept 
learning, serial probe acquisition, and 
probe delay effects. Journal of Experimental 
Psychology: Animal Behavior Processes, 10, 513-529. 

Wright, A. A., Shyan, M., & Jitsumori, M. 
(1990). Auditory same/different concept learning 
by monkeys. Animal Learning and Behavior, 
18, 287-294. 




CHAPTER 26 


Language and Thought 


Lila Gleitman 
Anna Papafragou 


Possessing a language is one of the cen- 
tral features that distinguishes humans from 
other species. Many people share the intu- 
ition that they think in language and the ab- 
sence of language therefore would be the 
absence of thought. One compelling ver- 
sion of this self-reflection is Helen Keller’s 
(1955) report that her recognition of the 
signed symbol for ‘water’ triggered thought 
processes that had theretofore — and conse- 
quently — been utterly absent. Statements to 
the same or related effect come from the 
most diverse intellectual sources: “The limits 
of my language are the limits of my world” 
(Wittgenstein, 1922); and “The fact of the 
matter is that the ‘real world’ is to a large ex- 
tent unconsciously built upon the language 
habits of the group” (Sapir, 1941, as cited in 
Whorf, 1956, p. 75). 

The same intuition arises with regard to 
particular languages and dialects. Speaking 
the language of one’s childhood seems to 
conjure up a host of social and cultural at- 
titudes, beliefs, memories, and emotions, as 
though returning to the Casbah or to Avenue L 
and East 19th Street and conversing 
with the natives opens a window back into 
some prior state of one’s nature. But do such 
states of mind arise because one is literally 
thinking in some new representational for- 
mat by speaking in a different language? Af- 
ter all, many people experience the same or 
related changes in sociocultural orientation 
and sense of self when they are, say, wear- 
ing their battered old jeans versus some re- 
quired business suit or military uniform; or 
even more poignantly when they reexperi- 
ence a smell or color or sound associated 
with dimly recalled events. Many such ex- 
periences evoke other times, other places. 
But according to many anthropological 
linguists, sociologists, and cognitive psychol- 
ogists, speaking a particular language ex- 
erts vastly stronger and more pervasive in- 
fluences than an old shoe or the smell of 
boiling cabbage. The idea of “linguistic rel- 
ativity” is that having language, or having a 
particular language, crucially shapes mental 
life. Indeed, it may not be only that a spe- 
cific language exerts its idiosyncratic effects 
as we speak or listen to it — that language 
might come to “be” our thought; we may 
have no way to think many thoughts, conceptualize 
many of our ideas, without this language. From such 
a perspective, different communities of humans, 
speaking different languages, would think differently to 
the extent that languages differ from one an- 
other. But is this so? Could it be so? That 
depends on how we unpack the notions al- 
luded to so informally thus far. 

In one sense, it is obvious that language 
use has powerful and specific effects on 
thought. That’s what it is for, or at least that 
is one of the things it is for — to transfer 
ideas from one mind to another mind. Imag- 
ine Eve telling Adam “Apples taste great.” 
This fragment of linguistic information, as 
we know, caused Adam to entertain a new 
thought with profound effects on his world 
knowledge, inferencing, and subsequent be- 
havior. Much of human communication is 
an intentional attempt to modify others’ 
thoughts and attitudes in just this way. This 
information transmission function is crucial 
for the structure and survival of cultures and 
societies in all their known forms. 

But the language-and-thought debate is 
not framed to query whether the content 
of conversation can influence one’s attitudes 
and beliefs, for the answer to that question 
is too obvious for words. At issue, rather, is 
the degree to which natural languages pro- 
vide the format in which thought is neces- 
sarily (or at least habitually) couched. Do 
formal aspects of a particular linguistic sys- 
tem (e.g., features of the grammar or the 
lexicon) organize the thought processes of 
its users? One famous “Aye” to this question 
appears in the writings of B. L. Whorf in the 
first half of the twentieth century. Accord- 
ing to Whorf (1956, p. 214), the grammatical 
and lexical resources of individual languages 
heavily constrain the conceptual representa- 
tions available to their speakers. To quote: 


We are thus introduced to a new principle 
of relativity, which holds that all observers 
are not led by the same physical evidence 
to the same picture of the universe, unless 
their linguistic backgrounds are similar, or 
can in some way be calibrated. 


This relativistic view, in its strictest form, 
entails that linguistic categories will be the 
“program and guide for an individual’s mental 
activity” (Whorf, 1956, p. 212), including categorization, 
memory, reasoning, and decision 
making. If this is right, then the study of 
different linguistic systems may throw light 
onto the diverse modes of thinking encour- 
aged or imposed by such systems. Here is 
a recent formulation of this view (Pederson 
et al., 1998, p. 586): 


We surmise that language structure . . . pro- 
vides the individual with a system of rep- 
resentation, some isomorphic version of 
which becomes highly available for incor- 
poration as a default conceptual represen- 
tation. Far more than developing simple ha- 
bituation, use of the linguistic system, we 
suggest, actually forces the speaker to make 
computations he or she might otherwise 
not make. 


Even more dramatically, according to 
stronger versions of this general position, we 
can newly understand much about the de- 
velopment of concepts in the child mind: 
One acquires concepts as a consequence of 
their being systematically instantiated in the 
exposure language (Bowerman & Levinson, 
2001, p. 13): 


Instead of language merely reflecting the 
cognitive development which permits and 
constrains its acquisition, language is 
thought of as potentially catalytic and 
transformative of cognition. 


The importance of this position cannot 
be overestimated: Language here becomes 
a vehicle for the growth of new concepts — 
those that were not theretofore in the mind, 
and perhaps could not have been there with- 
out the intercession of linguistic experience. 
It therefore poses a challenge to the venerable 
view that one could not acquire a concept 
that one could not antecedently entertain 
(Plato, 5th-4th B.C.E.; Descartes, 1662; 
Fodor, 1975, inter alia). 

Quite a different position is that language, 
although being the central human conduit 
for thought in communication, memory, and 
planning, neither creates nor materially dis- 
torts conceptual life: Thought is first; lan- 
guage is its expression. This contrasting view 
of cause and effect leaves the link between 
language and thought just as relevant for understanding mental 
life. From Noam Chomsky’s universalist per- 
spective (1975, p. 4), for example, the forms 
and contents of all particular languages de- 
rive, in large part, from an antecedently spec- 
ified cognitive substance and architecture 
and therefore provide a rich diagnostic of 
human conceptual commonalities: 


Language is a mirror of mind in a deep and 
significant sense. It is a product of human 
intelligence... By studying the properties of 
natural languages, their structure, organi- 
zation, and use, we may hope to learn some- 
thing about human nature; something sig- 
nificant, if it is true that human cognitive 
capacity is the truly distinctive and most 
remarkable characteristic of the species. 


This view of concepts as prior to and pro- 
genitive of language is not proprietary to 
the rationalist position for which Chomsky 
is speaking here. This commonsensical po- 
sition is maintained — rather, presupposed — 
by students of the mind who differ among 
themselves in almost all other regards. The 
early empiricists, for example, took it for 
granted that our concepts derive from expe- 
rience with properties, things, and events in 
the world and not, originally, from language 
(Hume, 1739, Book I): 


To give a child an idea of scarlet or or- 
ange, of sweet or bitter, I present the ob- 
jects, or in other words, convey to him these 
impressions; but proceed not so absurdly, 
as to endeavor to produce the impressions 
by exciting the ideas. 


And as a part of such experience of 
objects, language learning will come along 
for the ride (Locke, 1690, Book 3.IX.9; 
emphasis ours): 


If we will observe how children learn lan- 
guages, we shall find that, to make them 
understand what the names of simple ideas 
or substances stand for, people ordinarily show 
them the thing whereof they would have 
them have the idea; and then repeat to 
them the name that stands for it... 


Thus linguistic relativity, in the sense of 
Whorf and many recent commentators, is quite novel 
and, in its strongest interpretations, revolutionary. 
At the limit, it is a proposal 
for how new thoughts can arise in the 
mind as a result of experience with language 
rather than as a result of experience with the 
world of objects and events. 

Before turning to the recent literature 
on language and thought, we want to em- 
phasize that there are no ideologues ready 
to man the barricades at the absolute ex- 
tremes of the debate just sketched. To our 
knowledge, none — well, very few — of those 
who are currently advancing linguistic-relativistic 
themes and explanations believe 
that infants enter into language acquisition 
in a state of complete conceptual naked- 
ness later redressed (perhaps we should say 
“dressed”) by linguistic information. Rather, 
by general acclaim, infants are believed to 
possess some “core knowledge” that enters 
into first categorization of objects, proper- 
ties, and events in the world (e.g., Carey, 
1982; Kellman, 1996; Baillargeon, 1993; 
Gelman & Spelke, 1981; Leslie & Keeble, 
1987; Mandler, 1996; Quinn, 2001; Spelke 
et al., 1992). The general question is how 
richly specified this innate basis may be 
and how experience refines, enhances, and 
transforms the mind’s original furnishings. 
The specific question is whether language 
knowledge may be one of these formative or 
transformative aspects of experience. To our 
knowledge, none — well, very few — of those 
who adopt a nativist position on these mat- 
ters reject as a matter of a priori conviction 
the possibility that there could be salience 
effects of language on thought. For instance, 
some particular natural language might for- 
mally mark a category whereas another does 
not; two languages might draw a category 
boundary at different places; two languages 
might differ in the computational resources 
they require to make manifest a particular 
distinction or category. 

We will try to draw out aspects of these 
issues within several domains in which com- 
mentators and investigators are trying to dis- 
entangle cause and effect in the interaction 
of language and thought. We cannot dis- 
cuss it all, of course, or even very much 
of what is currently in print on this topic. 
There is too much of it. (For anthologies, see 
Gumperz & Levinson, 1996; Bowerman & Levinson, 2001; 
Gentner & Goldin-Meadow, 2003). 


Do We Think In Language? 


We begin with a very simple question: Do 
our thoughts take place in natural language? 
If so, it would immediately follow that 
Whorf was right all along, since speak- 
ers of Korean and Spanish, or Swahili and 
Hopi would have to think systematically dif- 
ferent thoughts. 

If language directly expresses our 
thought, it seems to make a poor job of it. 
Consider for example the final (nonparen- 
thetical) sentence in the preceding section: 


1. There is too much of it. 


Leaving aside, for now, the problems 
of anaphoric reference (what is “it”?), the 
sentence still has at least two interpre- 
tations that are compatible with its dis- 
course context: 


1a. There is too much written on linguistic 
relativity to fit into this article. 


1b. There is too much written on linguistic 
relativity. (Period!) 


We authors had one of these two inter- 
pretations in mind (guess which one). We 
had a thought and expressed it as (1) but 
English failed to render that thought unam- 
biguously, leaving doubt between (1a) and 
(1b). One way to think about what this ex- 
ample portends is that language cannot, or in 
practice does not, express all and only what 
we mean. Rather, language use offers hints 
and guideposts to hearers, such that they can 
usually reconstruct what the speaker had in 
mind by applying to the uttered words a 
good dose of common sense — aka thoughts, 
inferences, and plausibilities — in the world. 

The question of just how to apportion the 
territory between the underlying semantics 
of sentences and the pragmatic interpreta- 
tion of the sentential semantics, of course, is 
far from settled in linguistic and philosophical 
theorizing. Consider the sentence It is 
raining. Does this sentence directly — that is, 
as an interpretive consequence of the linguis- 
tic representation itself — convey an assertion 
about rain falling here, in the immediate ge- 
ographical environment of the speaker? Or 
does the sentence — the linguistic represen- 
tation — convey only that rain is falling, leav- 
ing it for the common sense of the listener to 
deduce that the speaker likely meant raining 
here and now rather than raining today in 
Bombay or on Mars; likely, too, that if the 
sentence was uttered indoors, the speaker 
more likely meant here to convey “just out- 
side of here” than “right here, as the roof 
is leaking.” The exact division of labor be- 
tween linguistic semantics and pragmatics 
has implications for the language-thought 
issue, because the richer (one claims that) 
the linguistic semantics is, the more likely it 
is that language guides our mental life. With- 
out going into detail, we will argue that lin- 
guistic semantics cannot fully envelop and 
substitute for inferential interpretation, and 
the representations that populate our mental 
life therefore cannot be identical to the rep- 
resentations that encode linguistic (seman- 
tic) meaning. 


Language Is Sketchy, Thought Is Rich 


There are several reasons to believe that 
thought processes are not definable over rep- 
resentations that are isomorphic to linguis- 
tic representations. One is the pervasive am- 
biguity of words and sentences. Bat, bank, 
and bug all have multiple meanings in En- 
glish and are associated with multiple con- 
cepts, but these concepts themselves are 
clearly distinct in thought, as shown inter 
alia by the fact that one may consciously 
construct a pun. Moreover, several linguis- 
tic expressions including pronouns (he, she) 
and indexicals (here, now) crucially rely on 
context for their interpretation whereas the 
thoughts they are used to express are usu- 
ally more specific. Our words are often se- 
mantically general — i.e., they fail to make 
distinctions that nevertheless are present in 
thought. For example, the English word uncle 
does not semantically specify whether the individual comes 
from the mother’s or the father’s side, or 
whether he is a relative by blood or mar- 
riage, but usually the speaker who utters 
“my uncle...” possesses the relevant infor- 
mation. Indeed, lexical items typically take 
on different interpretations tuned to the oc- 
casion of use (He has a square face. The room 
is hot.) and depend on inference for their 
precise construal in different contexts (e.g., 
the implied action is systematically differ- 
ent when we open an envelope/a can/an um- 
brella/a book, or when an instance of that 
class of actions is performed to serve dif- 
ferent purposes: Open the window to let in 
the evening breeze/the cat). Moreover, there 
are cases in which linguistic output does 
not even encode a complete thought or 
proposition (tomorrow, maybe). Finally, the 
presence of implicatures and other kinds 
of pragmatic inference ensures that — to 
steal a line from the Mad Hatter — although 
speakers generally mean what they say, they 
do not and could not say exactly what 
they mean. 

From this and related evidence, it ap- 
pears that linguistic representations under- 
determine the conceptual contents they are 
used to convey: Language is sketchy com- 
pared with the richness of our thoughts (for 
a related discussion, see Fisher & Gleitman, 
2002). In light of the limitations of lan- 
guage, time, and sheer patience, language 
users make reference by whatever catch- 
as-catch-can methods they find handy, in- 
cluding the waitress who famously told an- 
other that “The ham sandwich wants his 
check” (Nunberg, 1978). What chiefly mat- 
ters to talkers and listeners is that successful 
reference be made, whatever the means at 
hand. If one tried to say all and exactly what 
one meant, conversation could not happen; 
speakers would be lost in thought. Instead, 
conversation involves a constant negotiation 
in which participants estimate and update 
each other’s background knowledge as a basis 
for what needs to be said given what is 
mutually known and inferable (e.g., Grice, 
1975; Sperber & Wilson, 1986; Clark, 1992; 
Bloom, 2002). 


Hearers can even ignore linguistically encoded meaning if it 
patently differs from what the speaker intended, 
for instance by smoothly and 
rapidly repairing slips of the tongue. Oxford 
undergraduates had the wit, if not the grace, 
to snicker when Reverend Spooner reput- 
edly said, “Work is the curse of the drinking 
classes.” Often, the misspeaking is not even 
consciously noticed but is repaired to fit the 
thought — evidence enough that the word 
and the thought are two different matters. 
The same latitude for thought to range be- 
yond established linguistic means holds for 
the speakers, too. Wherever the local linguistic 
devices and locutions seem insufficient 
or overly constraining, speakers invent 
or borrow words from another language, de- 
vise similes and metaphors, and sometimes 
make permanent additions and subtractions 
to the received tongue. It would be hard to 
understand how they do so if language were 
itself, and all at once, both the format and 
vehicle of thought. 

All the cases just mentioned refer to par- 
ticular tokenings of meanings in the id- 
iosyncratic interactions between people. A 
different problem arises when languages 
categorize aspects of the world in ways that 
are complex and inconsistent. An example is 
reported by Malt et al. (1999). They exam- 
ined the vocabulary used by English, Span- 
ish, and Chinese subjects to label the various 
containers we bring home from the grocery 
store full of milk, juice, ice cream, bleach, or 
medicine (e.g., jugs, bottles, cartons, boxes). 
As the authors point out, containers share 
names based not only on some perceptual re- 
semblances but also on very local and partic- 
ular conditions with size, shape, substance, 
contents, and nature of the contents, not 
to speak of the commercial interests of the 
purveyor, all playing interacting and shift- 
ing roles. In present-day American English, 
for instance, a certain plastic container that 
looks like a bear with a straw stuck in its 
head is called a juice box, although it is not 
boxy either in shape (square or rectangu- 
lar) or typical constitution (your prototypi- 
cal American box is made of cardboard). The 
languages Malt et al. studied differ markedly 
in the terms they offer for these containers, 
and also in how their subjects extended these 
terms to describe diverse new containers. 
Speakers of the three languages differed in 
which objects (old and new) they classified 
together by name. For example, a set of ob- 
jects distributed across the sets of jugs, con- 
tainers, and jars by English speakers were 
unified by the single label frasco by Spanish 
speakers. Within and across languages, not 
everything square is a box, not everything 
glass is a bottle, not everything not glass is not 
a bottle, and so on. The naming, in short, is 
a complex mix resulting from perceptual re- 
semblances, historical influences, and a gen- 
erous dollop of arbitrariness. Yet Malt et al.’s 
subjects did not differ much (if at all) from 
each other in their classification of these con- 
tainers by overall similarity rather than by 
name. Nor were the English and Spanish, as 
one might guess, more closely aligned than, 
say, the Chinese and Spanish. So here we 
have a case in which cross-linguistic practice 
groups objects in a domain in multiple ways 
that have only flimsy and sporadic correlations 
with perception without discernible effect 
on the nonlinguistic classificatory behaviors 
of users. 

So far, we have emphasized that language 
is a relatively impoverished and underspeci- 
fied vehicle of expression that relies heavily 
on inferential processes outside the linguistic 
system for reconstructing the richness and 
specificity of thought. If correct, this seems 
to place rather stringent limitations on how 
language could serve as the original engine 
and sculptor of our conceptual life. Never- 
theless, it is possible to maintain the idea 
that certain formal properties of language 
causally affect thought in more subtle, but 
still important, ways. 


Use It or Lose It: Language 
Determines the Categories of Thought 


We begin by mentioning the most famous 
and compelling case of a linguistic influ- 
ence on perception: categorical perception 
of the phoneme (Liberman, 1970; Liberman ...). Infants 
begin life with the capacity and inclination 
to discriminate among all of the acoustic-phonetic 
properties by which languages encode 
distinctions of meaning — a result famously 
documented by Peter Eimas (Eimas 
et al., 1971) using a dishabituation paradigm 
(for details and significant expansions of this 
basic result, see Jusczyk, 1985; and for extensions 
with neonates, Peña et al., 2003). 
These authors showed that an infant will 
work (e.g., turn its head or suck on a nip- 
ple) to hear a syllable such as ba. After some 
period of time, the infant habituates; that 
is, its sucking rate decreases to some base 
level. The high sucking rate can be rein- 
stated if the syllable is switched to, say, pa, 
demonstrating that the infant detects the difference. 
These effects are heavily influenced 
by linguistic experience. Infants only a year 
or so of age — just when true language is 
making its appearance — have become in- 
sensitive to phonetic distinctions that are 
not phonemic (play no role at higher lev- 
els of linguistic organization) in the expo- 
sure language (Werker & Tees, 1984). Al- 
though these experience-driven effects are 
not totally irreversible in cases of long-term 
second-language immersion, they are perva- 
sive and dramatic (for discussion, see Werker 
& Logan, 1985; Best, McRoberts, & Sithole, 
1988). Without special training or unusual 
talent, the adult speaker-listener can effectively 
produce and discriminate the phonetic 
categories required in the native tongue, 
and little more. Not only that, these dis- 
criminations are categorical in the sense 
that sensitivity to within-category phonetic 
distinctions is poor and sensitivity at the 
phonemic boundaries is especially acute. Al- 
though the learning and use of a specific lan- 
guage has not created perceptual elements 
de novo, certainly it has refined, organized, 
and limited the set of categories at this level 
in radical ways. As we will discuss, sev- 
eral findings in the concept-learning litera- 
ture have been interpreted analogously to 
this case. 

An even more intriguing effect in this general 
domain is the reorganization of phonetic 
elements into higher-level phonological 
categories as a function of the language spoken. 
For example, American English 
speech regularly lengthens vowels in 
syllables ending with a voiced consonant 
(e.g., ride as opposed to write) and neutralizes the t/d 
distinction in favor of a dental flap in cer- 
tain unstressed syllables. The effect is that 
(in most dialects) the consonant sounds in 
the middle of rider and writer are physically 
the same. Yet the English-speaking listener 
seems to perceive a d/t difference in these 
words all the same, and — except when asked 
to reflect carefully — fails to notice the characteristic 
difference in vowel length that his 
or her own speech faithfully reflects. The 
complexity of this phonological reorganiza- 
tion is often understood as a reconciliation 
(interface) of the cross-cutting phonetic and 
morphological categories of a particular lan- 
guage. Ride ends with a d sound; write ends 
with a t sound; morphologically speaking, 
rider and writer are just ride and write with 
er added on; therefore, the phonetic entity 
between the syllables in these two words 
must be d in the first case and t in the second. 
Morphology trumps phonetics (for discussion, 
see Bloch & Trager, 1942; Chomsky, 
1964; Gleitman & Rozin, 1977). 

When considering linguistic relativity, 
one might be tempted to write off the pho- 
netic categorical perception effect as one 
that merely tweaks the boundaries of acous- 
tic distinctions built into the mammalian 
species — a not-so-startling sensitizing effect 
of language on perception. But the phono- 
logical effect just discussed is no mere tweak. 
There has been a systemic reorganization 
creating a new set of lawfully recombinato- 
rial elements — one that varies very signifi- 
cantly cross-linguistically. 

Much of the literature on linguistic rel- 
ativity can be understood as raising related 
issues in various perceptual and conceptual 
domains. Is it the case that distinctions of 
lexicon or grammar made regularly in one’s 
language sensitize one to these distinctions 
and suppress or muffle others? Even to the 
extent of radically reorganizing the domain? 
An important literature has investigated this 
issue using the instance of color names and 
color perception. Languages differ in their 
color terminology (Berlin & Kay, 
1969; cf. Kay & Regier, 2002). Do psychophysical 
judgments differ accordingly? 
For instance, are adjacent hues that share a 
name in a particular language judged more 
similar by its speakers than equal-magnitude 
differences in wavelength and intensity that 
are consensually given different names in 
that language? And are the similarity spaces 
of speakers of other languages different in 
the requisite ways? Such language-caused 
distinctions have been measured in various 
ways — for example, discrimination across 
hue labeling boundaries (speed, accuracy, 
confusability), memory, and population 
comparisons. By and large, the results of such 
cross-linguistic studies suggest a remarkable 
independence of hue perception from label- 
ing practice (e.g., Brown & Lenneberg, 1954; 
Heider & Oliver, 1972). One relevant finding 
comes from red-green color-blind individ- 
uals (Jameson & Hurwich, 1978). The per- 
ceptual similarity space of the hues for such 
individuals is systematically different from 
that of individuals of normal vision; that is 
what it means to be colorblind. Yet a large 
subpopulation of red-green colorblind in- 
dividuals names hues, even of new things, 
consensually with normal-sighted individu- 
als and orders these hue labels consensually. 
That is, these individuals do not perceptually 
order a set of color chips with the reds at one 
end, the greens at the other, and the oranges 
somewhere in between; yet they organize 
the words with red semantically at one end, 
green at the other, and orange somewhere 
in between. In short, the naming practices 
and perceptual organization of color mis- 
match in these individuals, which is a fact 
that they rarely notice until they enter the 
vision laboratory. 

Overall, the language-thought relations 
for one perceptual domain (speech-sound 
perception) appear to be quite different 
from those in another perceptual domain 
(hue perception). Language influences 
acoustic phonetic perception much more 
than it influences hue perception. As a re- 
sult, there is no deciding in advance that lan- 
guage does or does not influence perceptual 
life. Moreover, despite the prima facie 
relevance of these cases and 
of the literature that investigated them, the 
perception of relatively low-level percep- 
tual categories, the organization of which 
we share with many nonhuman species, are 
less than ideal places to look for the lin- 
guistic malleability of thought. However, 
these instances serve to scaffold discussion 
of language influences at higher levels and 
therefore for more elusive aspects of concep- 
tual organization. 


Do the Categories of Language 
Become the Categories of Thought? 


A seminal figure in reawakening interest in 
linguistic relativity was Roger Brown, the 
great social and developmental psychologist 
who framed much of the field of language ac- 
quisition in the modern era. Brown (1957) 
performed a simple and elegant experiment 
that demonstrated an effect of lexical cate- 
gorization on the inferred meaning of a new 
word. Young children were shown a picture, 
for example, of hands that seemed to be 
kneading confettilike stuff in an overflow- 
ing bowl. Some children were told Show 
me the sib. They pointed to the bowl (a 
solid rigid object). Others were told Show 
me some sib. They pointed to the confetti 
(an undifferentiated mass of stuff). Others 
were told Show me sibbing. They pointed 
to the hands and made kneading motions 
with their own hands (an action or event). 
Plainly, the same stimulus object was repre- 
sented differently depending on the linguis- 
tic cues to the lexical categories count noun, 
mass noun, and verb. That is, the lexical cate- 
gories themselves have notional correlates — 
at least in the minds of these young Eng- 
lish speakers. 

Some commentators have argued that the 
kinds of cues exemplified here — that per- 
sons, places, and things surface as nouns — 
are universal and can play causal roles in 
the acquisition of language — of course, by 
learners who are predisposed to find just 
these kinds of syntactic-semantic correlations 
(Fisher, 1996; Bloom, 1994a; Lidz, Gleitman, 
& Gleitman, 2003; Baker, 2001, inter alia). 
Brown saw his result the other way around. 
He supposed that languages would vary ar- 
bitrarily in these mappings onto conceptual 
categories. If that is so, then language can- 
not play the causal role that Pinker and oth- 
ers envisaged for it — that is, as a cue to an- 
tecedently “prepared” correlations between 
linguistic and conceptual categories. Rather, 
those world properties yoked together by 
language would cause a (previously uncom- 
mitted) infant learner to conceive them as 
meaningfully related in some ways (Brown, 
1957, p. 5): 


In learning a language, therefore, it must be 
useful to discover the semantic correlates 
for the various parts of speech; for this dis- 
covery enables the learner to use the part- 
of-speech membership of a new word as a 
first cue to its meaning... Since [grammat- 
ical categories] are strikingly different in 
unrelated languages, the speakers [of these 
languages] may have quite different cogn- 
itive categories. 


As recent commentators have put this po- 
sition, linguistic regularities are part of the 
correlational mix that creates ontologies, and 
language-specific properties therefore will 
bend psychological ontologies in language- 
specific ways (Smith, Colunga, & Yoshida, 
2001). The forms of particular languages — or 
the habitual language usage of particular lin- 
guistic communities — by hypothesis, could 
yield different organizations of the funda- 
mental nature of one’s conceptual world: 
what it is to be a thing or some stuff, or a 
direction or place, or a state or event. We 
will discuss some research on these category 
types and their cross-linguistic investigation. 
But before doing so, we want to mention 
another useful framework for understand- 
ing potential relations between language and 
thought: that the tweakings and reorganiza- 
tions language may accomplish happen un- 
der the dynamic control of communicative 
interaction, of “thinking for speaking.” 

It is natural to conceive conversation as beginning
with a thought or mental message
one wishes to convey. This thought is the 
first link in a chain of mental events that, 
on most accounts, gets translated into suc- 
cessively more languagelike representations, 
eventuating in a series of commands to the 
articulatory system to utter a word, phrase, 
or sentence (Levelt, 1989; Dell, 1995). As 
we have just described matters, there is a 
clear distinction at the two ends of this pro- 
cess — what you meant to say and how you 
express it linguistically. But this is not so 
clear. Several commentators, notably Dan 
Slobin (1996, 2003), have raised the possi- 
bility of a more dynamic and interactive pro- 
cess in which what one chooses to mean and 
the expressive options that one’s language 
makes available are not so neatly divorced. 
It may not be that speakers of every language 
set out their messages identically all the way 
up to the time that they arrange the jaw, 
mouth, and tongue to utter one two three ver- 
sus un deux trois. Instead, the language one 
has learned causes one to “intend to mean” in 
somewhat different ways. For instance, and 
as we will discuss in more detail, it may be 
that as a speaker of English, with its myriad 
verbs of manner of motion, one comes to in- 
spect — and speak of — the world in terms of 
such manners, whereas a speaker of Greek 
or Spanish, with a vocabulary emphasizing 
verbs relating to path of motion, inspects — 
and speaks of — the world more directly in 
terms of the paths traversed. The organi- 
zation of the thought, on this view, might 
be dynamically impacted along its course by 
specific organizational properties of the in- 
dividual language. 

Slobin (2001) and Levelt (1989) have 
pointed to some cases in which a distinc- 
tion across languages in the resources de- 
voted to different conceptual matters seems 
almost inevitable. This case is the closed- 
class functional vocabulary, the “grammat- 
ical” words such as modals, auxiliaries, tense 
and aspect markers, determiners, comple- 
mentizers, case markers, prepositions, and so 


matical roles in marking the ways in which 
noun phrases relate to the verb and how 
the predications within a sentence relate to 
each other. These same grammatical words 
usually also have semantic content — for ex- 
ample, the directional properties of from in 
John separated the wheat from the chaff. Slobin 
has given a compendium of the semantic 
functions known to be expressed by such 
items and these number at least in the several 
hundreds, including not only tense, aspect, 
causativity, number, person, gender, mood, 
definiteness, and so on, found in English, but 
also first-hand versus inferred knowledge, 
social status of the addressee, existence— 
nonexistence, shape, and many others. Both 
Slobin and Levelt have argued as follows: As 
a condition of uttering a well-formed English 
sentence, the speaker of English must decide,
for example, whether the number of crea- 
tures being referred to is one or more in order 
to choose the dog or the dogs. Some mod- 
icum of mental resources, no matter how 
small, must be devoted to this issue repeat- 
edly — hundreds of times a day every day, 
every week, every year — by English speak- 
ers. But speakers of Mandarin need not think 
about number, except when they particu- 
larly want to, because its expression is not 
grammaticized in their language. The same 
is true for all the hundreds of other prop- 
erties. So either all speakers of languages 
covertly compute all these several hundred 
properties as part of their representations of 
the contents of their sent and received mes- 
sages or they compute only some of them — 
primarily those that they must compute 
to speak and understand the language of 
their community. On information-handling 
grounds, one would suspect that not all 
these hundreds of conceptual interpreta- 
tions and their possible combinations are 
computed at every instance. But if one com- 
putes only what one must for the com- 
bined purposes of linguistic intelligibility 
and present communicative purpose, then 
speakers of different languages, to this extent,
must be thinking differently. As Slobin
(2001, p. 442) puts it, “From this point of
structuring language-specific mental spaces,
rather than being there at the beginning, 
waiting for an input language to turn them 
on.” On the basis of this reasoning, it is plau- 
sible to entertain the view of a language- 
based difference in the dynamics of convert- 
ing thought to speech. How far such effects 
percolate downstream is the issue to which 
we now turn. Do differences in phraseology, 
grammatical morphology, and lexical seman- 
tics of different languages yield underlying 
disparities in their modes of thought? 


Semantic Arenas of the Present Day 
Language-Thought Investigation 


Objects and Substances 


The problem of reference to stuff versus ob- 
jects has attracted considerable attention be- 
cause it starkly displays the indeterminacy in 
how language refers to the world (Chomsky, 
1957; Quine, 1960). Whenever we indicate a 
physical object, we necessarily indicate some 
portion of a substance as well; the reverse 
is also true. Languages differ in their ex- 
pression of this distinction (Lucy & Gaskins, 
2001). Some languages make a grammatical 
distinction that roughly distinguishes object 
from substance. Count nouns in such languages
denote individuated entities, such as
object kinds. These are marked in English
with determiners and are subject to counting 
and pluralization (a horse, horses, two horses). 
Mass nouns typically denote nonindividu- 
ated entities — that is, substance rather than 
object kinds. These are marked in English 
with a different set of determiners (more 
porridge) and need an additional term that 
specifies quantity to be counted and plu- 
ralized (a tube of toothpaste rather than a 
toothpaste). Soja, Carey, and Spelke (1991) 
asked whether children approach this aspect 
of language learning already equipped with 
the ontological distinction between things 
and substance or whether they are led to 
make this distinction through learning count 
and mass syntax. Their subjects, English- 
speaking two-year-olds, did not yet make 


(1991) taught these children words in refer- 
ence to various types of unfamiliar displays. 
Some were solid objects such as a T-shaped 
piece of wood, and others were nonsolid sub- 
stances such as a pile of hand cream with 
sparkles in it. The children were shown such 
a sample, named with a term presented in 
a syntactically neutral frame that identified 
it neither as a count nor as a mass noun — 
for example, This is my blicket or Do you see 
this blicket? In extending these words to new 
displays, two-year-olds honored the distinc- 
tion between object and substance. When 
the sample was a hard-edged solid object, 
they extended the new word to all objects 
of the same shape, even when made of a dif- 
ferent material. When the sample was a non- 
solid substance, they extended the word to 
other-shaped puddles of that same substance 
but not to shape matches made of different 
materials. Soja et al. took this finding as ev- 
idence of a conceptual distinction between 
objects and stuff, independent of and prior 
to the morphosyntactic distinction made in 
English. 

This interpretation was put to stronger 
tests by extending such classificatory tasks 
to languages that differ from English in 
these regards: Either these languages do not 
grammaticize the distinction, or they orga- 
nize it in different ways (see Lucy, 1992; 
Lucy & Gaskins, 2001, for findings from 
Yucatec Mayan; Mazuka & Friedman, 2000; 
Imai & Gentner, 1997, for Japanese). Es- 
sentially, nouns in these languages all start 
life as mass terms, requiring a special gram- 
matical marker (called a classifier) to be 
counted. One might claim, then, that sub- 
stance is in some sense linguistically basic for 
Japanese whereas objecthood is basic for En- 
glish speakers because of the dominance of 
its count-noun morphology.4 So if children
are led to differentiate object and substance 
reference by the language forms themselves, 
the resulting abstract semantic distinction 
should differ cross-linguistically. To test this 
notion, Imai and Gentner replicated the 
tests of Soja et al. with Japanese and En- 
glish children and adults. Some of their find- 
ings appear to strengthen the evidence for a 
us to think about both individual objects and
portions of stuff because both American and 
Japanese children (even two-year-olds) ex- 
tended names for complex hard-edged non- 
sense objects on the basis of shape rather 
than substance. The lack of separate gram- 
matical marking did not put Japanese chil- 
dren at a disadvantage in this regard. 

Another aspect of the results hints at a 
role for language in categorization, however. 
Japanese children tended to extend names 
for mushy hand cream displays according to 
their substance, for example, whereas Amer- 
ican children were at chance for these items. 
There were also discernible language effects 
on word extension for certain very simple 
stimuli (e.g., a kidney bean-shaped piece of 
colored wax) that seemed to fall at the ontological
midline between object and substance.
Whereas the Japanese at ages two and
four years were at chance on these items, En- 
glish speakers showed a tendency to extend 
words for them by shape. 

How are we to interpret these results? 
Several authors have concluded that onto- 
logical boundaries literally shift to where 
language makes its cuts; that the substance 
versus object distinction works much like 
the categorical perception effects we no- 
ticed for phonemes (and perhaps colors; 
for an important statement, see Gentner & 
Boroditsky, 2001). Lucy and Gaskins (2001) 
bolster this interpretation with evidence that 
populations speaking different languages dif- 
fer increasingly in this regard with age. 
Whereas young Mayan speakers do not differ 
much from their English-speaking peers, by 
age nine years members of the two commu- 
nities differ significantly in relevant classifi- 
catory and memorial tasks. The implication 
is that long-term use of a language influ- 
ences ontology with growing conformance 
of concept grouping to linguistic group- 
ing. Of course, the claim is not for a ram- 
pant Procrustean reorganization of thought; 
only for boundary shifting. For displays 
that blatantly fall to one side or the other 
of the object/substance boundary, therefore, 
the speakers of all the tested languages sort 
the displays in the same ways. 


terpretations of such experiments are easy 
to attain at the present state of the art. For 
one thing, Mazuka and Friedman (2000)
failed to reproduce Lucy’s effects for Mayan 
versus English-speaking subjects’ classifica- 
tory performance in the predicted further 
case of Japanese. As these authors point 
out, the sameness in this regard between 
Japanese and English speakers, and the dif- 
ference in this regard between Mayan and 
English speakers, may best be thought of as 
arising from cultural and educational differ- 
ences between the populations rather than 
linguistic differences. 

In light of all the findings so far reviewed, 
there is another interpretation of the results 
that does not implicate an effect of language 
on thought but only an effect of language 
on language: One’s implicit understanding 
of the organization of a specific language 
can influence one’s interpretation of con- 
versation. Interpretations from this perspec- 
tive have been offered by many commen- 
tators. Bowerman (1996), Brown (1958), 
Landau and Gleitman (1985), and Slobin 
(1996, 2001) propose that native speakers 
not only learn and use the individual lexical 
items their language offers but also learn the 
kinds of meanings typically expressed by a 
particular grammatical category in their lan- 
guage and come to expect new members 
of that category to have similar meanings. 
Slobin calls this “typological bootstrapping.” 
Languages differ strikingly in their common 
forms and locutions — preferred fashions of 
speaking, to use Whorf’s phrase. These prob- 
abilistic patterns could bias the interpreta- 
tion of new words. Such effects occur in 
experiments when subjects are offered lan- 
guage input (usually nonsense words) un- 
der conditions in which implicitly known 
form-to-meaning patterns in the language 
might hint at how the new word is to be 
interpreted. 

Let us reconsider the Imai and Gentner 
object-substance effects on this hypothe- 
sis. As we saw, when the displays them- 
selves were of nonaccidental-looking hard- 
edged objects, subjects in both language 
groups opted for the object interpretation. 
for softish waxy lima bean shapes), the listeners
fell back upon linguistic cues, if available.
No relevant morphosyntactic clues exist
in Japanese, so Japanese subjects chose
at random for these indeterminate stimuli. 
For English-speaking subjects, the linguis- 
tic stimulus in a formal sense also was in- 
terpretively neutral: This blicket is a tem- 
plate that accepts both mass and count 
nouns (this horse/toothpaste). But here prin- 
ciple and probability part company. Re- 
cent experimentation leaves no doubt that 
child and adult listeners incrementally ex- 
ploit probabilistic facts about word use to 
guide the comprehension process on line 
(e.g., Snedeker, Thorpe, & Trueswell, 2001). 
In the present case, any English speaker 
equipped with even a rough subjective prob- 
ability counter should take into account the 
massive preponderance of count nouns over 
mass nouns in English and conclude that a 
new word, blicket, used to refer to some in- 
determinate display, is probably a new count 
noun rather than a new mass noun. Count 
nouns, in turn, tend to denote individuals 
rather than stuff and so have shape pre- 
dictivity (Smith, 2001; Landau, Smith, & 
Jones, 1998). 
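
The base-rate argument in the preceding paragraph can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: the function name and the prior of 0.8 are hypothetical stand-ins for "the massive preponderance of count nouns," not figures taken from any of the studies cited. It shows that when the sentence frame (This blicket) is equally compatible with count and mass nouns, the listener's best guess simply falls back on the lexical base rate.

```python
# Hypothetical illustration: every number here is an assumption,
# not a measurement from the studies discussed in the text.

def posterior_count_noun(prior_count, lik_frame_given_count, lik_frame_given_mass):
    """Return P(count noun | frame) by Bayes' rule for a two-category lexicon."""
    prior_mass = 1.0 - prior_count
    evidence = (prior_count * lik_frame_given_count
                + prior_mass * lik_frame_given_mass)
    return prior_count * lik_frame_given_count / evidence

# Assumed base rate: count nouns far outnumber mass nouns in English.
ASSUMED_PRIOR_COUNT = 0.8
# "This blicket" accepts both categories, so the frame is uninformative:
# it is equally likely under either category.
NEUTRAL_LIKELIHOOD = 1.0

p_count = posterior_count_noun(ASSUMED_PRIOR_COUNT,
                               NEUTRAL_LIKELIHOOD,
                               NEUTRAL_LIKELIHOOD)
print(f"P(count noun | neutral frame) = {p_count:.2f}")
```

With a neutral frame the posterior equals the prior, which is the sense in which "principle and probability part company": the syntax is formally neutral, but the probabilities are not.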

Applying this interpretation, it is not that 
speaking English leads one to tip the scales 
toward object representations of newly seen 
referents for perceptually ambiguous items, 
but that hearing English leads one to tip 
the scales toward count-noun representa- 
tion of newly heard nominals in linguis- 
tically ambiguous structural environments. 
Derivatively, then, count syntax hints at ob- 
ject representation of the newly observed 
referent. Notice that such effects can be 
expected to increase with age as massive 
lexical-linguistic mental databases are built, 
consistent with the findings of Lucy and 
Gaskins (2001).5 


Spatial Relationships 


Choi and Bowerman (1991) studied the ways 
in which common motion verbs in Korean 
differ from their counterparts in English. 
First, Korean motion verbs often contain lo- 


typically specified by a spatial preposition in 
English. To describe a scene in which a cas- 
sette tape is placed into its case, for exam- 
ple, English speakers would say “We put the 
tape in the case.” Korean speakers typically 
use the verb kkita to express the put in rela- 
tion for this scene. Kkita does not have the 
same extension as put in. Both put in and 
kkita describe an act of putting an object in 
a location; but put in is used for all cases of 
containment (fruit in a bowl, flowers in a 
vase) whereas kkita is used only in case the 
outcome is a tight fit between two matching 
shapes (tape in its case, one Lego piece on 
another, glove on hand). Notice that there 
is a cross-classification here: Whereas En- 
glish appears to collapse across tightnesses 
of fit, Korean makes this distinction but 
conflates across putting in versus putting on, 
which English regularly differentiates. Very 
young learners of these two languages have 
already worked out the language-specific 
classification of such motion relations and 
events in their language, as shown by both 
their usage and their comprehension (Choi 
& Bowerman, 1991). 

Do such cross-linguistic differences have 
implications for spatial cognition? Mc- 
Donough, Choi, and Mandler (2003) fo- 
cused on spatial contrasts between relations 
of tight containment versus loose support 
(grammaticalized in English by the prepo- 
sitions in and on and in Korean by the verbs 
kkita and nohta) and tight versus loose con- 
tainment (both grammaticalized as in in 
English but separately as kkita and nehta 
in Korean). They showed that prelinguis- 
tic infants (nine to fourteen months old) 
in both English- and Korean-speaking en- 
vironments are sensitive to such contrasts, 
and so are Korean-speaking adults (see also 
Hespos & Spelke, 2000, who show that five- 
month-olds are sensitive to this distinction). 
Their English-speaking adult subjects, how- 
ever, showed sensitivity only to the tight 
containment versus loose support distinc- 
tion, which is grammaticalized in English 
(in versus on). The conclusion drawn from 
these results was that some spatial relations 
that are salient during the prelinguistic stage 
guage does not systematically encode them:
“Flexible infants become rigid adults.” 

This interpretation again resembles that 
for the perception of phoneme contrasts, but 
by no means as categorically. The fact that 
English speakers learn and readily use verbs 
such as jam, pack, and wedge weakens any 
claim that the lack of common terms seri- 
ously diminishes the availability of catego- 
rization in terms of tightness of fit. One pos- 
sibility is that the observed language-specific 
effects with adults are attributable to verbal 
mediation: Unlike preverbal infants, adults 
may have turned the spatial classification 
task into a linguistic task. It therefore is use- 
ful to turn to studies that explicitly compare 
performance when subjects from each lan- 
guage group are instructed to classify ob- 
jects or pictures by name, as opposed to 
when they are instructed to classify the same 
objects by similarity. In one such study, Li 
et al. (1997) showed Korean- and English- 
speaking subjects pictures of events such as 
putting a suitcase on a table (an example 
of “on” in English, and of “loose support” 
in Korean). For half the subjects from each 
language group (each tested fully in their 
own language), these training stimuli were 
labeled by a videotaped cartoon character 
who performed the events (I am Miss Picky 
and I only like to put things on things. See?), 
and for the other subjects, the stimuli were 
described more vaguely (... and I only like to 
do things like this. See?). Later categorization 
of new instances followed language in the 
labeling condition: English speakers identi- 
fied new pictures showing tight fits (e.g., a 
cap put on a pen) as well as the original 
loose-fitting ones as belonging to the cate- 
gory that Miss Picky likes, but Korean speak- 
ers generalized only to new instances of loose 
fits. These language-driven differences rad- 
ically diminished in the similarity sorting 
condition in which the word (on or nohta) 
was not invoked; in this case the catego- 
rization choices of the two language groups 
were essentially the same. The “language on 
language” interpretation we commended in 
discussing the object/substance distinction 
in this case, too, seems to encompass the 


spatial relations. 


Motion 


Talmy (1985) described two styles of 
motion expression characterizing different 
languages: Some languages, including En- 
glish, typically use a verb plus a separate 
path expression to describe motion events. 
In such languages, manner of motion is en- 
coded in the main verb (e.g., walk, crawl, 
slide, or float), and path information appears 
in nonverbal elements such as particles, ad- 
verbials, or prepositional phrases (e.g., away, 
through the forest, out of the room). In Greek 
or Spanish, the dominant pattern instead 
is to include path information within the 
verb itself (e.g., Greek bjeno, “exit” and beno, 
“enter”); the manner of motion often goes 
unmentioned or appears in gerunds, preposi- 
tional phrases, or adverbials (trehontas, “run- 
ning”). These patterns are not absolute. 
Greek has motion verbs that express manner, 
and English has motion verbs that express 
path (enter, exit, cross). But several studies 
have shown that children and adults have 
learned these dominance patterns. Slobin 
(1996) showed that child and adult Spanish 
and English speakers vary in the terms they 
typically use to describe the same picture- 
book stories with English speakers display- 
ing greater frequency and diversity of man- 
ner of motion verbs. Papafragou, Massey, 
and Gleitman (2002) showed the same ef- 
fects for the description of motion scenes 
by Greek- versus English-speaking children 
and, much more strongly, for Greek- versus
English-speaking adults. 

Do such differences in event encoding af- 
fect the way speakers think about motion 
events? Papafragou et al. (2002) tested their 
English- and Greek-speaking subjects on ei- 
ther memory of path or manner details of 
motion scenes, or categorization of motion 
events on the basis of path or manner sim- 
ilarities. Even though speakers of the two 
languages exhibited an asymmetry in encod- 
ing manner and path information in their 
verbal descriptions, they did not differ in 
terms of classification or memory for path 
tained for Spanish versus English by Gennari
et al. (2002). Corroborating evidence
also comes from studies by Munnich, Lan- 
dau, and Dosher (2001), who compared En- 
glish, Japanese, and Korean speakers’ naming 
of spatial locations and their spatial memory 
for the same set of locations. They found 
that, even in aspects in which languages 
differed (e.g., encoding spatial contact or 
support), there was no corresponding dif- 
ference in memory performance across lan- 
guage groups. 

Relatedly, the same set of studies sug- 
gests that the mental representation of mo- 
tion and location is independent of linguis- 
tic naming even within a single language. 
Papafragou et al. (2002) divided their Eng- 
lish- and Greek-speaking subjects’ verbal de- 
scriptions of motion according to whether 
they included a path or manner verb, re- 
gardless of native language. Although En- 
glish speakers usually chose manner verbs, 
sometimes they produced path verbs; the 
Greek speakers also varied but with the pre- 
ponderances reversed. It was found that verb 
choice did not predict memory for path or 
manner aspects of motion scenes or choice 
of path or manner as a basis for catego- 
rizing motion scenes. In the memory task, 
subjects who had used a path verb to de- 
scribe a scene were no more likely to detect 
later path changes to that scene than sub- 
jects who had used a manner verb (and vice 
versa for manner). In the classification task, 
subjects were not more likely to name two 
motion events they had earlier categorized as 
most similar by using the same verb. Naming 
and cognition, then, are distinct under these 
conditions: Even for speakers of a single 
language, the linguistic resources mobilized 
for labeling underrepresent the cognitive 
resources mobilized for cognitive processing
(e.g., memorizing, classifying, reasoning,
etc.).

An obvious conclusion from these stud- 
ies of motion representation is that the 
conceptual organization of space and mo- 
tion is robustly independent of language- 
specific labeling practices. Just as obvious, 
however, is that specific language usage 


speaker's intended meaning if the stimu- 
lus situation leaves such interpretation unre- 
solved. In another important demonstration 
of this language-on-language effect, Naigles 
and Terrazas (1998) asked subjects to de- 
scribe and categorize videotaped scenes — 
for example, of a girl skipping toward a 
tree. They found that Spanish- and English- 
speaking adults differed in their preferred 
interpretations of new (nonsense) motion 
verbs in manner-biasing (She’s kradding toward
the tree or Ella está mecando hacia el
árbol) or path-biasing (She’s kradding the tree
or Ella está mecando el árbol) sentence structures.
The interpretations were heavily influenced
by syntactic structure. But judgments
also reflected the preponderance of verbs in 
each language — Spanish speakers gave more 
path interpretations and English speakers 
gave more manner interpretations. Similar 
effects of language-specific lexical practices 
on presumed word extension have been 
found for adjectives (Waxman, Senghas, & 
Benveniste, 1997). 

A fair conclusion from this and related 
evidence is that verbal descriptions are un- 
der the control of many factors related to 
accessibility, including the simple frequency 
of a word’s use, as well as of faithfulness 
as a description of the scene. As several 
authors have argued, the dynamic process 
of expressing one’s thoughts is subject to 
the exigencies of linguistic categories that 
can vary from language to language. Given 
the heavy information-processing demands 
of rapid conversation, faithfulness often is 
sacrificed to accessibility. For these and other 
reasons, verbal reports do not come any- 
where near exhausting the observers’ mental 
representations of events. Language use, in 
this sense, is “sketchy.” Rather than “think- 
ing in words,” humans seem to make easy lin- 
guistic choices that, for competent listeners, 
serve as rough but usually effective pointers 
to those ideas. 


Spatial Frames of Reference 


Certain linguistic communities (e.g., Tene- 
japan Mayans) customarily use an externally 
tem to refer to nearby directions and positions
(“to the north”); others (e.g., Dutch
speakers) use a viewer-perspective (relative) 
system (“to the left”). Brown and Levinson 
(1993) and Pederson et al. (1998) recently 
suggested that these linguistic practices af- 
fect spatial reasoning in language-specific 
ways. In one of their experiments, Tenejapan 
Mayan and Dutch subjects were presented 
with an array of objects (toy animals) on a 
tabletop; after a brief delay, subjects were 
taken to the opposite side of a new table 
(they were effectively rotated 180 degrees), 
handed the toys, and asked to reproduce the 
array “in the same way as before.” The over- 
whelming majority of Tenejapan (absolute) 
speakers rearranged the objects so they were 
heading in the same cardinal direction after 
rotation, whereas Dutch (relative) speakers 
massively preferred to rearrange the objects 
in terms of left-right directionality. This co- 
variation of linguistic terminology and spa- 
tial reasoning seems to provide compelling 
evidence for linguistic influences on nonlin- 
guistic cognition. 
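
The two response patterns in the rotation task can be stated precisely as two rules for choosing a heading after the viewer turns. The following sketch is a schematic of the task logic only, under our own simplifying assumptions: the function names are hypothetical, headings are restricted to the four compass points, and nothing here models the actual experimental materials.

```python
# Hypothetical sketch of the two response strategies in the rotation task.
# Headings are compass directions in degrees (0 = north, 90 = east);
# only multiples of 90 are handled, which is an illustrative simplification.

def egocentric_side(heading_deg, facing_deg):
    """Where a heading points relative to a viewer facing `facing_deg`."""
    relative = (heading_deg - facing_deg) % 360
    return {0: "ahead", 90: "right", 180: "behind", 270: "left"}[relative]

def absolute_response(original_heading, facing_before, facing_after):
    """Keep the same compass heading, whatever the viewer's new orientation."""
    return original_heading

def relative_response(original_heading, facing_before, facing_after):
    """Keep the same left/right relation to the viewer,
    so the compass heading turns along with the viewer."""
    turn = (facing_after - facing_before) % 360
    return (original_heading + turn) % 360

# At the first table the viewer faces north and the animals head east,
# that is, toward the viewer's right. At the second table the viewer
# faces south (a 180-degree rotation).
before, after, animals = 0, 180, 90
abs_heading = absolute_response(animals, before, after)
rel_heading = relative_response(animals, before, after)
print("absolute:", abs_heading, egocentric_side(abs_heading, after))
print("relative:", rel_heading, egocentric_side(rel_heading, after))
```

Under the absolute rule the animals still head east but now point to the viewer's left; under the relative rule they still point to the viewer's right but now head west. Because the two rules give opposite arrangements after a 180-degree turn, the task separates them cleanly.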

As so often is the case in this literature, 
however, it is quite hard to disentangle cause 
and effect. For instance, it is possible that 
the Tenejapan and Dutch groups think about 
space differently because their languages 
pattern differently; but it is just as possi- 
ble that the two linguistic-cultural groups 
developed different spatial-orientational vo- 
cabulary to reflect (rather than cause) dif- 
ferences in their spatial reasoning strate- 
gies. Li and Gleitman (2002) investigated 
this second position. They noted that ab- 
solute spatial terminology is widely used in 
many English-speaking communities whose 
environment is geographically constrained 
and includes large stable landmarks such as 
oceans and looming mountains. The abso- 
lute terms uptown, downtown, and crosstown 
(referring to North, South, and East-West) 
are widely used to describe and navigate in 
the space of Manhattan Island; Chicagoans
regularly make absolute reference to the 
lake, etc. It is quite possible, then, that the 
presence or absence of stable landmark in- 
formation rather than language spoken in- 


coordinate frameworks. After all, the influ- 
ence of such landmark information on spa- 
tial reasoning has been demonstrated with 
nonlinguistic (rats; Restle, 1957) and prelinguistic
(infants; Acredolo & Evans, 1980) animals.
To examine this possibility, Li and
Gleitman replicated Brown and Levinson’s 
rotation task with English speakers, but they 
manipulated the presence or absence of 
landmark cues in the testing area. The re- 
sult, just as for the rats and the infants, was 
that English-speaking adults respond abso- 
lutely in the presence of landmark informa- 
tion (after rotation, they set up the animals 
going in the same cardinal direction) and rel- 
atively when it is withheld (they set up the 
animals going in the same relative — left or 
right — direction). 

Flexibility in spatial reasoning in this re- 
gard should come as little surprise. The abil- 
ity to navigate in space is hard-wired in the 
brain of moving creatures, including bees 
and ants. For all of these organisms, re- 
liable orientation and navigation in space 
are crucial for survival (Gallistel, 1990). 
Accordingly, neurobiological evidence from 
humans and other species indicates that the brain
routinely uses a multiplicity of coordinate 
frameworks in coding for the position of ob- 
jects to prepare for directed action (Gallistel, 
2002). It would be quite amazing if, among 
all the creatures that walk, fly, and crawl on 
the earth, only humans, by virtue of acquir- 
ing a particular language, lose the ability to 
use both absolute and relative spatial coordi- 
nate frameworks flexibly. The case is by no 
means closed even on this issue, however, 
because successive probes of the rotation 
situation have continued to yield conflict- 
ing results both within and across languages 
(e.g., Levinson, Kita, & Haun, 2002; Li &
Gleitman, in preparation). One way of reconciling
these findings and theories has to
do with the level of analysis to which the 
Levinson group’s findings are thought to apply.
Perhaps we are prisoners of language
only in complex and highly derived tasks and 
only when behavior is partly under the con- 
trol of verbal instructions that include vague 
expressions such as “make it the same.” But 
phenomenon.


Evidentiality 


One of Whorf’s most interesting conjectures 
concerned the possible effects of evidentials 
(linguistic markers of information source) on 
the nature of thought. Whorf pointed out 
that Hopi — unlike English — marked evi- 
dential distinctions in its complementizer 
system. Comparing the sentences I see that 
it is red vs. I see that it is new, he remarked 
(Whorf, 1956, p. 85): 


We fuse two quite different types of rela- 
tionship into a vague sort of connection ex- 
pressed by ‘that’, whereas the Hopi indi- 
cates that in the first case seeing presents a 
sensation ‘red,’ and in the second that see- 
ing presents unspecified evidence for which 
is drawn the inference of newness. 


Whorf concluded that this grammatical 
feature was bound to make certain concep- 
tual distinctions easier to draw for the Hopi 
speaker because of the force of habitual lin- 
guistic practices. 

Papafragou, Li, Choi, and Han (in 
preparation) sought to put this proposal 
to the test. They compared English, which 
mainly marks evidentiality lexically (I 
saw/heard/inferred that John left), with 
Korean, in which evidentiality is encoded 
through a set of dedicated morphemes. 
Given evidence that such morphemes are 
produced early by children learning Korean 
(Choi, 1995), they asked whether Korean 
children develop the relevant conceptual 
distinctions earlier and with greater reli- 
ability than learners of English, in which 
evidentiality is not grammatically encoded. 
In a series of experiments, they compared 
the acquisition of nonlinguistic distinctions 
between sources of evidence in three- and 
four-year-olds learning English or Korean: 
No difference in nonlinguistic reasoning in 
these regards was found between the English 
and Korean groups. For instance, children in 
both linguistic groups were equally good 
at reporting how they found out about the 
contents of a container (e.g., by looking 
inside or by being told); both groups were 


tents of a container to a character who had 
looked inside but not to another character 
who had had no visual access to its content. 
Furthermore, Korean learners were more 
advanced in their nonlinguistic knowledge 
of sources of information than in their 
knowledge of the meaning of linguistic evi- 
dentials. In this case, then, learned linguistic 
categories do not seem to serve as a guide 
for the individual’s nonlinguistic categories 
in the way that Whorf conjectured. Rather, 
the acquisition of linguistically encoded 
distinctions seems to follow, and build upon, 
the conceptual understanding of evidential 
distinctions. The conceptual understanding 
itself appears to proceed similarly across 
diverse language-learning populations. 


Time 

Thus far, we have focused on grammati- 
cal and lexical properties of linguistic sys- 
tems and their possible effects on conceptual 
structure. Here we consider another aspect 
of languages as expressive systems — their 
systematically differing use of certain net- 
works of metaphor; specifically, metaphor 
for talking about time (Boroditsky, 2001). 
English speakers predominantly talk about 
time as if it were horizontal (one pushes 
deadlines back, expects good times ahead, or 
moves meetings forward), whereas Mandarin 
speakers more usually talk about time in 
terms of a vertical axis (they use the Man- 
darin equivalents of up and down to refer 
to the order of events, weeks, or months). 
Boroditsky showed that these differences 
predict aspects of temporal reasoning by 
speakers of these two languages. In one of 
her manipulations, subjects were shown two 
objects in vertical arrangement, say, one fish 
following another one downward, as they 
heard something like The black fish is win- 
ning. After this vertically oriented prime, 
Mandarin speakers were faster to confirm 
or disconfirm temporal propositions (e.g., 
March comes earlier than April) than if they 
were shown the fish in a horizontal array. 
The reverse was true for English speakers. 
Boroditsky concluded that spatiotemporal 
[...] reason about time. She has suggested, more 
generally, that such systematic linguistic 
metaphors are important in shaping habit- 
ual patterns of thought. 

However, these results are again more 
complex than they seem at first glance. For 
one thing, and as Boroditsky acknowledges, 
vertical metaphors of time are by no means 
absent from ordinary English speech (e.g., I 
have a deadline coming up), although they are 
more sporadic than in Mandarin. So again we 
have a cross-linguistic difference of degree, 
rather than a principled opposition. More- 
over, Boroditsky briefly trained her English- 
speaking subjects to think about time ver- 
tically, as in Mandarin. After such training, 
the English speakers exhibited the vertical 
(rather than the former horizontal) priming 
effect. Apparently, fifteen minutes of train- 
ing on the vertical overcame and completely 
reversed twenty-plus years of the habitual 
use of the horizontal in these speakers. The 
effects of metaphor, it seems, are transient 
and fluid without long-term influence on 
the nature of conceptualization or its im- 
plicit deployment to evaluate propositions in 
real time. 


Number 


Prelinguistic infants and nonhuman pri- 
mates share an ability to represent both ex- 
act numerosities for very small sets (roughly 
up to three objects) and approximate nu- 
merosities for larger sets (Dehaene, 1997). 
Human adults possess a third system for rep- 
resenting number that allows for the rep- 
resentation of exact numerosities for large 
sets; in principle has no upper bound on set 
size; and can support the comparison of nu- 
merosities of different sets, as well as pro- 
cesses of addition and subtraction. Crucially, 
this system is generative because it possesses 
a rule for creating successive integers (the 
successor function) and therefore is charac- 
terized by discrete infinity (see Gallistel & 
Gelman, Chap. 23). 

How do young children become capable 
of using this uniquely human number 
system? One powerful answer is that 


number system are innate; gaining access 
to these principles gives children a way of 
grasping the infinitely discrete nature of 
natural numbers, as manifested by their 
ability to use verbal counting (Gelman & 
Gallistel, 1978; Gallistel & Gelman, Chap. 
23). Other researchers propose that children 
come to acquire the adult number system 
by conjoining properties of the two prelin- 
guistic number systems via natural language. 
Specifically, they propose that grasping the 
linguistic properties of number words (e.g., 
their role in verbal counting or their seman- 
tic relations to quantifiers such as few, all, 
many, most; see Spelke & Tsivkin, 2001a and 
Bloom, 1994b; Carey, 2001 respectively) 
enables children to put together elements of 
the two previously available number systems 
to create a new, generative number faculty. 
In Bloom’s (1994b, p. 186) words, “in the 
course of development, children ‘bootstrap’ 
a generative understanding of number out of 
the productive syntactic and morphological 
structures available in the counting system.” 

Upon hearing the number words in a 
counting context, for instance, children re- 
alize that these words map onto both 
specific representations delivered by the 
exact-numerosities calculator and inexact 
representations delivered by the approxima- 
tor device. By conjoining properties of these 
two systems, children gain insight into the 
properties of the adult conception of num- 
ber (e.g., that each of the number words 
picks out an exact set of entities, that adding 
or subtracting exactly one object changes 
number, etc.). Ultimately, it is hypothesized 
that this process enables the child to com- 
pute exact numerosities even for large sets 
(such as seven or twenty-three) — an ability 
not afforded by either of the prelinguistic 
calculation systems. 

Spelke and Tsivkin (2001a, b) experi- 
mentally investigated the thesis that lan- 
guage contributes to exact large-number cal- 
culations. In their studies, bilinguals who 
were trained on arithmetic problems in a 
single language and later tested on them 
were faster on large-number arithmetic if 
tested in the training language; however, no 
[...] appeared with estimation problems. The con- 
clusion from this and related experiments 
was that the particular natural language is 
the vehicle of thought concerning large ex- 
act numbers but not about approximate nu- 
merosities. Such findings, as Spelke and her 
collaborators have emphasized, can be part 
of the explanation of the special “smartness” 
of humans. Higher animals, like humans, can 
reason to some degree about approximate 
numerosity, but not about exact numbers. 
Beyond this shared core knowledge, how- 
ever, humans have language. If language is 
a required causal factor in exact number 
knowledge, in principle this could explain 
the gulf between creatures like us and crea- 
tures like them. 

How plausible is the view that the 
adult number faculty presupposes linguis- 
tic mediation? Recall that, on this view, 
children infer the generative structure of 
number from the generative structure of 
grammar when they hear others count- 
ing. However, counting systems vary cross- 
linguistically, and in a language like English, 
their recursive properties are not really ob- 
vious from the outset. Specifically, until 
number eleven, the English counting sys- 
tem presents no evidence of regularity, much 
less of generativity: A child hearing one, two, 
three, four, five, six, up to eleven, would have 
no reason to assume — based on properties 
of form — that the corresponding numbers 
are lawfully related (namely, that they suc- 
cessively increase by one). For larger num- 
bers, the system is more regular, even though 
not fully recursive because of the presence 
of several idiosyncratic features (e.g., one 
can say eighteen or nineteen but not tenteen 
for twenty). In sum, it is not so clear how 
the “productive syntactic and morphological 
structures available in the counting system” 
will provide systematic examples of discrete 
infinity that can then be imported into num- 
ber cognition (see Grinstead et al., 2003, for 
detailed discussion). 

Can properties of other natural language 
expressions bootstrap a generative under- 
standing of number? Quantifiers have been 


2001). However, familiar quantifiers lack the 
hallmark properties of the number system: 
They are not strictly ordered with respect 
to one another, and their generation is not 
governed by the successor function. In fact, 
several quantifiers presuppose the compu- 
tation of cardinality of sets — for example, 
neither and both apply only to sets of two 
items (Keenan & Stavi, 1986; Barwise & 
Cooper, 1981). Moreover, quantifiers com- 
pose in quite different ways from numbers. 
For example, the expression most men and 
women cannot be interpreted to mean a large 
majority of the men and much less than half 
the women (A. Joshi, personal communica- 
tion). In light of the semantic disparities be- 
tween the quantifier and integer systems, it 
is hard to see how it is possible to bootstrap 
the semantics of one from the other. 
Recent experimental findings suggest, 
moreover, that young children understand 
certain semantic properties of number 
words well before they know those of quan- 
tifiers. One case involves the scalar interpre- 
tation of these terms. In one experiment, 
Papafragou and Musolino (2003) had five- 
year-old children watch as three horses were 
shown jumping over a fence. The children 
would not accept Two of the horses jumped 
over the fence as an adequate description of 
that event (even though it is necessarily true 
that if three horses jumped, then certainly 
two did). But at the same age, they will 
accept Some of the horses jumped over the 
fence as an adequate description even though 
it is true that all of the horses jumped. In 
another experiment, Hurewitz, Papafragou, 
Gleitman, and Gelman (in review) found 
that three-year-olds understand certain se- 
mantic properties of number words such as 
two and four well before they know those 
of quantifiers such as some and all. It seems, 
then, that the linguistic systems of number 
and natural-language quantification are de- 
veloping rather independently. If anything, 
the children seem more advanced in knowl- 
edge of the meaning of number words than 
quantifiers so it is hard to see how the 
semantics of the former lexical type is to 
[...] latter. 


Orientation 


A final domain we discuss is spatial orien- 
tation. Cheng and Gallistel (1984) found 
that rats rely on geometric information to 
reorient themselves in a rectangular space, 
and seem incapable of integrating geomet- 
rical with nongeometrical properties (e.g., 
color, smell, etc.) in searching for a hidden 
object. If they see food hidden at the cor- 
ner of a long and a short wall, they will 
search equally at either of the two such corners 
of a rectangular space after disorientation; 
this is true even if these corners are dis- 
tinguishable by one of the long walls be- 
ing painted blue or having a special smell, 
and so on. Hermer and Spelke (1994, 1996) 
reported a very similar difficulty in young 
children. Both animals and young children 
can navigate and reorient by the use of ei- 
ther geometric or nongeometric cues; it is 
integrating across the cue types that creates 
trouble. These difficulties are overcome by 
older children and adults, who are able, for 
instance, to go straight to the corner formed 
by a long wall to the left and a short blue 
wall to the right. Hermer and Spelke found 
that success in these tasks was significantly 
predicted by the spontaneous combination 
of spatial vocabulary and object properties 
such as color within a single phrase (e.g., to 
the left of the blue wall).7 Later experiments 
(Hermer-Vasquez, Spelke, and Katsnelson, 
1999) revealed that adults who were asked 
to shadow speech had more difficulty in 
these orientation tasks than adults who were 
asked to shadow a rhythm with their hands; 
however, verbal shadowing did not disrupt 
subjects’ performance in tasks that required 
the use of nongeometric information only. 
The conclusion was that speech-shadowing, 
unlike rhythm-shadowing, by taking up lin- 
guistic resources, blocked the integration of 
geometrical and object properties, which is 
required to solve complex orientation tasks. 
In short, success at the task seems to require 


cally linguistic format. 

In a recent review article, Carruthers 
(2002) suggests even more strongly that in 
number, space, and perhaps other domains, 
language is the medium of intermodular 
communication, a format in which repre- 
sentations from different domains can be 
combined to create novel concepts. In stan- 
dard assumptions about modularity, how- 
ever, modules are characterized as compu- 
tational systems with their own proprietary 
vocabulary and combinatorial rules. Because 
language itself is a module in this sense, 
its computations and properties (e.g., gen- 
erativity, compositionality) cannot be trans- 
ferred to other modules because they are 
defined over — and can only apply to - 
language-internal representations. One way 
out of this conundrum is to give up the as- 
sumption that language is — on the appropri- 
ate level — modular: 


Language may serve as a medium 
for this conjunction...because it is a 
domain-general, combinatorial system to 
which the representations delivered by the 
child's ... [domain-specific] nonverbal sys- 
tems can be mapped. (Spelke & Tsivkin, 
2001, p. 84). 


Language is constitutively involved in 
(some kinds of) human thinking. Specifi- 
cally, language is the vehicle of nonmodu- 
lar, nondomain-specific, conceptual think- 
ing which integrates the results of modular 
thinking (Carruthers, 2002, p. 666). 


On this view, the output of the linguistic 
system just is Mentalese: There is no other 
level of representation in which the infor- 
mation to the left of the blue wall can be en- 
tertained. This picture of language is novel 
in many respects. In the first place, replac- 
ing Mentalese with a linguistic representa- 
tion challenges existing theories of language 
production and comprehension. Tradition- 
ally, and as discussed earlier, it is assumed 
the production of sentences begins by en- 
tertaining the corresponding thought, which 
then mobilizes the appropriate linguistic re- 
sources for its expression (e.g., Levelt, 1989). 
[...] (2002, p. 668) observed: 


We cannot accept that the production of a 
sentence ‘The toy is to the left of the blue 
wall’ begins with a tokening of the thought 
THE TOY IS TO THE LEFT OF THE 
BLUE WALL (in Mentalese), since our hy- 
pothesis is that such a thought cannot be 
entertained independently of being framed 
in a natural language. 


Inversely, language comprehension clas- 
sically is taken to unpack linguistic repre- 
sentations into mental representations that 
then can trigger further inferences. But in 
Carruthers’ proposal, after hearing The toy 
is to the left of the blue wall, the interpretive 
device cannot decode the message into the 
corresponding thought because there is no 
level of Mentalese independent of language 
in which the constituents are lawfully con- 
nected to each other. Interpretation can only 
dismantle the utterance and send its con- 
cepts back to the geometric and landmark 
modules to be processed. In this sense, un- 
derstanding an utterance such as The picture 
is to the right of the red wall turns out to be 
a very different process than understanding 
superficially similar utterances such as The 
picture is to the right of the wall, or The pic- 
ture is on the red wall, which do not, on this 
account, require cross-domain integration. 

Furthermore, if language is to serve as a 
domain for cross-module integration, then 
the lexical resources of each language be- 
come crucial for conceptual combination. 
Lexical gaps in the language will block con- 
ceptual integration, for instance, because 
there would be no relevant words to insert 
into the linguistic string. We know that color 
terms vary across languages (Kay & Regier, 
2002); more relevantly, not all languages 
have terms for left and right (Levinson, 
1996). It follows that speakers of these lan- 
guages should fail to combine geometric and 
object properties in the same way as do En- 
glish speakers to recover from disorientation. 
In other words, depending on the spatial vo- 
cabulary available in their language, disori- 
ented adults may behave either like Spelke 
and Tsivkin’s English-speaking population 


prediction, although merely carrying the 
original proposal to its apparent logical con- 
clusion, is quite radical: It allows a striking 
discontinuity among members of the human 
species, contingent not upon the presence or 
absence of human language and its combi- 
natorial powers (as the original experiments 
seem to suggest) or even upon cultural and 
educational differences, but on vagaries of 
the lexicon in individual linguistic systems. 
Despite its radical entailments, there is a 
sense in which Spelke’s proposal to inter- 
pret concept configurations on the basis of 
the combinatorics of natural language can be 
construed as decidedly nativist. In fact, we so 
construe it. Spelke’s proposal requires that 
humans be equipped with the ability to con- 
struct novel structured syntactic represen- 
tations, insert lexical concepts at the termi- 
nal nodes of such representations (left, blue, 
etc.), and interpret the outcome on the ba- 
sis of familiar rules of semantic composition 
(to the left of the blue wall). In other words, 
humans are granted principled knowledge of 
how phrasal meaning is to be determined by 
lexical units and the way they are composed 
into structured configurations. That is, what 
is granted is the ability to read the seman- 
tics off of phrase structure trees. Further, the 
assumption is that this knowledge is not at- 
tained through learning but belongs to the 
in-built properties of the human language 
device. But notice that granting humans the 
core ability to build and interpret phrase 
structures is already granting them quite a 
lot. Exactly these presuppositions have been 
the hallmark of the nativist program in lin- 
guistics and language acquisition (Chomsky, 
1957; Pinker, 1984; Gleitman, 1990; Lidz, 
Gleitman, & Gleitman, 2002; Jackendoff, 
1990) and the target of vigorous dissent else- 
where (Tomasello, 2000; Goldberg, 1995). 
To the extent that Spelke and Tsivkin’s ar- 
guments about language and cognition rely 
on the combinatorial and generative powers 
of language, they already make quite deep 
commitments to abstract (and unlearnable) 
syntactic principles and their semantic re- 
flexes. Notice in this regard that because 
these authors hold that any natural language 
[...] required inferences, the principles at work 
here must be abstract enough to wash out 
the diverse surface-structural realizations of 
to the left of the blue wall in the languages of 
the world. Independently of particular expe- 
riences, an organism with such principles in 
place could generate and systematically com- 
prehend novel linguistic strings with mean- 
ings predictable from the internal organiza- 
tion of those strings and, for different but 
related reasons, just as systematically fail to 
understand other strings such as to the left of 
the blue idea. We would be among the last 
to deny such a proposal in its general form. 
We agree that there are universal aspects 
of the syntax-semantics interface. Whether 
these derive from or augment the combina- 
torial powers of thought is the question at 
issue here. For the present commentators, it 
is hard to see how shifting the burden of the 
acquisition of compositional semantics from 
the conceptual system to the linguistic sys- 
tem diminishes the radical nativist flavor of 
the position. 


Conclusions and Future Directions 


We have just tried to review the burgeoning 
psychological and anthropological literature 
that attempts to relate language to thought. 
We began with the many difficulties in- 
volved in radical versions of the linguistic rel- 
ativity position, including the fact that lan- 
guage seems to underspecify thought and to 
diverge from it regarding the treatment of 
ambiguity, paraphrase, and deictic reference. 
Moreover, there is ample evidence that sev- 
eral forms of cognitive organization are in- 
dependent of language: Infants who have no 
language are able to entertain relatively com- 
plex thoughts; for that matter, they can learn 
languages or even invent them when the 
need arises (Goldin-Meadow, 2003; Senghas 
et al., 1997). Many bilinguals, as a mat- 
ter of course, “code-switch” between their 
known languages even during the utterance 
of a single sentence (Joshi, 1985). Aphasics 
sometimes exhibit impressive propositional 


can form representations of space, artifacts, 
and perhaps even mental states without 
linguistic crutches (Hauser & Carey, 1998; 
Gallistel, 1990; Hare, Call, & Tomasello, 
2001; and Call & Tomasello, Chap. 25). In 
light of all these language-thought dispari- 
ties, it would seem perverse to take an equa- 
tive position on relations between the two. 

At the same time, compelling experi- 
mental studies again and again document 
intimate, seemingly organic, relationships 
among language, thought, and culture, of 
much the kind that Whorf and Sapir drew 
out of their field experiences. What is to 
explain these deep correlations between 
culturally divergent ways of thinking and 
culturally divergent ways of talking? In cer- 
tain cases, we argued that cause and effect 
had simply been prematurely placed on one 
foot or another because of the crudeness 
of our investigative tools. Inconveniently 
enough, it is often hard to study language 
development apart from conceptual and 
cultural learning or to devise experiments in 
which these factors can be prevented from 
interacting, so it is hard to argue back to 
origins. On the other hand, the difficulty 
of even engineering such language-thought 
dissociations in the laboratory is one signifi- 
cant point in favor of a linguistic-relativistic 
view. Why should it be so hard to pry them 
apart if they are so separate? 

Over the course of the discussion, our 
reading of the evidence provides some 
global support for what we take to be the “ty- 
pological bootstrapping” and “thinking for 
speaking” positions articulated in various 
places by Slobin (1996, 2001, 2003, inter 
alia). Language influences thought “on line” 
and in many ways. For the learner, the par- 
ticular speech events that one experiences 
can and do provide cues to nonlinguistic 
categorization — that is, a new linguistic la- 
bel “invites” the learner to attend to certain 
types of classification criteria over others. 
Markman and Hutchinson (1984) found that 
if one shows a two-year-old a new object 
and says See this one; find another one, the 
child typically reaches for something that 
has a spatial or encyclopedic relation to the 
original object (e.g., [...] 
the dog). But if one uses a new word (See this 
fendle, find another fendle), the child typically 
looks for something from the same category 
(e.g., finding another dog to go with the 
first dog). Similar effects have been obtained 
with much younger children: Balaban and 
Waxman (1997) showed that labeling can fa- 
cilitate categorization in infants as young as 
nine months (cf. Xu, 2002). Beyond catego- 
rization, labeling has been shown to guide 
infants’ inductive inference (e.g., expecta- 
tions about nonobvious properties of novel 
objects), even more so than perceptual sim- 
ilarity (Welder & Graham, 2001). Other re- 
cent experimentation shows that labeling 
may help children solve spatial tasks by 
pointing to specific systems of spatial rela- 
tions (Loewenstein & Gentner, 2003). For 
learners, then, the presence of linguistic la- 
bels constrains criteria for categorization and 
serves to foreground a codable category out 
of all the possible categories to which a stim- 
ulus could be said to belong. 

To what extent these linguistic influences 
result in mere tweaks — slight shifts in the 
boundaries between categories — or in more 
radical reorganizations of the learners’ con- 
ceptual world (as in the reorganizational 
principles that stand between phonetics and 
phonology) is hard to say at the present 
time. For competent adult users, thinking for 
speaking effects arise again to coax the lis- 
tener toward certain interpretations of the 
speech he or she is hearing as a function 
of probabilistic features of a particular lan- 
guage. The clearest example in the analy- 
sis we presented is the series of inferences 
that lead to different cross-linguistic catego- 
rizations of novel not-clearly-individuatable 
stimulus items with nonsense names: If it is 
an English noun, it is probably an English 
count-noun; if it is an English count-noun, it 
is probably naming an individuatable object. 

It appears to us that much discussion 
about the relationship between language and 
thought has been colored by an underlying 
disagreement about the nature of language 
itself. Many commentators, struck by ob- 
served cross-linguistic diversity in semantic 
and syntactic categories, have taken this di- 
versity as a possible source of deeper cogni- 
tive discontinuities among speakers of dif- 
ferent languages. But other commentators 
see this cross-linguistic diversity as much 
more limited and superficial than the bloom- 
ing, buzzing confusion coming out of the 
tower of Babel. For instance, many stud- 
ies in morphosyntax show that apparently 
distinct surface configurations of linguistic 
elements in different languages can be ana- 
lyzed in terms of underlying structural simi- 
larities (Chomsky, 2000; Baker, 2001). Stud- 
ies in linguistic semantics suggest that the 
properties and meanings of syntactic entities 
(e.g., determiners) are severely constrained 
cross-linguistically (Keenan & Stavi, 1986). 
Many of these principles of language organi- 
zation seem to map quite transparently from 
core knowledge of the kinds studied in in- 
fants (e.g., Quinn, 2001; Baillargeon, 1993; 
and other sources mentioned throughout). 
For instance, scenes of kangaroos jumping 
come apart into the kangaroo (argument) 
part and jumping (predicate) part in every 
natural language, but also in the prelinguis- 
tic parsing of events by children, including 
those learning language under circumstances 
of extreme linguistic and sensory deprivation 
(e.g., blind or isolated deaf children: Goldin- 
Meadow, 2003; Landau & Gleitman, 1985; 
Senghas et al., 1997). Focus on this kind 
of evidence suggests that cross-linguistic di- 
versity is highly constrained by rich and 
deep underlying similarities in the nature of 
thought. Thus, rather than pointing to cogni- 
tive discontinuities among speakers of differ- 
ent languages, cross-linguistic diversity could 
reveal principled points of departure from 
an otherwise common linguistic-conceptual 
blueprint humans share as a consequence of 
their biological endowment. 


Acknowledgments 


We thank Jerry Fodor for a discussion of 
the semantics of raining, Ray Jackendoff 
for a discussion of phonology, as well as 
Dedre Gentner for her comments on this 
chapter. Much of our perspective derives 
from our collaborative work with Cynthia 
Fisher, Henry Gleitman, Christine Massey, 
Kimberly [...] 
Barbara Landau. Writing of this chapter 
was supported by NIH Grant No. 1-R01- 
HD37507-02 to J. Trueswell and L. R. Gleit- 
man and NIH Grant No. 1F32MH65020- 
01A2 to A. Papafragou. 


Notes 


1. In one experimental demonstration, subjects 
were asked: When an airplane crashes, where 
should the survivors be buried? They rarely no- 
ticed the meaning discrepancy in the question 
(Barton & Sanford, 1996). 


2. The similarity test may not be decisive for this 
case, as Malt, Sloman, and Gennari (2003), as 
well as Smith, Colunga, and Yoshida (2001), 
among others, have pointed out. Similarity 
judgments applied as the measuring instru- 
ment could systematically mask various non- 
perceptual determinants of organization in 
a semantic-conceptual domain, some poten- 
tially language-caused. Over the course of this 
chapter, we will return to consider other do- 
mains and other psychological measures. For 
further discussion of the sometimes arbitrary 
and linguistically varying nature of the lexi- 
con, even in languages that are typologically 
and historically closely related, see Kay (1996). 
He points out, for example, that English speak- 
ers use screwdriver whereas the Germans use 
Schraubenzieher (literally, “screwpuller”), and 
the French tournevis (literally, “screwturner”) 
for the same purposes; our turnpike exit-entry 
points are marked exit, whereas the Brazilians 
have entradas; and so forth. 


3. Categorical perception for speech sounds has 
been documented for other species, includ- 
ing chinchillas and macaques (e.g., Kuhl & 
Miller, 1978). Moreover, studies from Kay and 
Kempton (1984) and Roberson, Davies, and 
Davidoff (2000) suggest that even for hue 
perception, the relationship between linguis- 
tic and perceptual categorization is not so 
clear, with categorical perception effects ob- 
tained or not obtained depending on very 
delicate choices of experimental procedure 
and particular characteristics of the stimulus. 
For an important review, see Munnich and 
Landau (2003). 

4. This argument is not easy. One might argue 
that English is a classifier language much like 
Yucatec Mayan or Japanese — that is, that all 
its words start out as mass nouns and be- 
come countable entities only through adding 
the classifiers the and a (compare brick the sub- 
stance to a brick, the object). Detailed linguis- 
tic analysis, however, suggests there is a gen- 
uine typological difference here (Slobin, 2001 
and Lucy & Gaskins, 2001; Chierchia, 1998; 
Krifka, 1995, for discussion). The question is 
whether, because all languages formally mark 
the mass or count distinction in one way or 
another, the difference in particular linguis- 
tic means could plausibly rebound to impact 
ontology. 


5. We should point out that this hint, at best, is a 
weak one, another reason why the observed 
interpretive difference for Japanese and En- 
glish speakers, even at the perceptual midline, 
is also weak. Notoriously, English often violates 
the semantic generalization linking mass noun 
morphology with substancehood (compare, 
for example, footwear, silverware, furniture). 


6. Subsequent analysis of the linguistic data re- 
vealed that Greek speakers were more likely 
to include manner of motion in their ver- 
bal descriptions when manner was unexpected 
or noninferable, whereas English speakers in- 
cluded manner information regardless of in- 
ferability (Papafragou, Massey, & Gleitman, 
2003). This suggests that speakers may mon- 
itor harder-to-encode event components and 
choose to include them in their utterances 
when especially informative. This finding rein- 
forces the conclusion that verbally encoded as- 
pects of events vastly underdetermine the sub- 
tleties of event cognition. 


7. Further studies show that success in this task 
among young children is sensitive to the size 
of the room: In a large room, more four-year- 
olds succeed in combining geometric and land- 
mark information (Learmonth, Nadel, & New- 
combe, in press). Moreover, it is claimed that 
other species (chickens, monkeys) can use both 
types of information when disoriented (Val- 
lortigara, Zanforlin, & Pasti, 1990; Gouteux, 
Thinus-Blanc, & Vauclair, in press). For discus- 
sion, see Carruthers (2002). 
CHAPTER 27 


Paradigms of Cultural Thought 


Patricia M. Greenfield 


Two Paradigms of Thought: 
Phenomena, Theory, and 
Methodology 


In 1963, Jerome Bruner gave me the chance 
of a lifetime — to go to Senegal to do my dis- 
sertation on relations between culture and 
the development of thought. While there I 
made an unexpected discovery, one that led 
me into two radically different paradigms of 
cultural thought. I found that unschooled 
Wolof children, participating in a classic Pi- 
agetian conservation task, were unable to re- 
ply to the question, “Why do you think (or 
say) this glass has more (or equal) water?”; 
yet they quickly answered an alternative 
form of the question: “Why does this glass 
have more (or equal) water?” (Greenfield, 
1966). U.S. or Swiss children, of course, 
had no difficulty in understanding the first 
question. Neither did Wolof schoolchildren. 
What did this difference mean? At first this 
seemed to be a methodological problem. 
Later I realized it was a reflection of deep 
differences in cultural psychology: In pro- 
viding a reason for their thoughts or words, 


Western and Wolof school children were dis- 
playing psychological mindedness; they dis- 
tinguished between their own thought or 
statement about something and the thing 
itself. In contrast, the unschooled Wolof 
children were not making this distinction. 
They were assuming the world on one plane 
with thought and object of thought as one 
unified reality. 

I am going to use this difference to pro- 
vide some historical background for the the- 
oretical theme of this chapter: that there are 
two major paradigms of cultural thought, an 
individualistic one and a collectivistic one, 
and that each is part of a larger pathway 
of development that encompasses the social 
as well as the cognitive (Greenfield et al., 
2003). Although this theme leads to a very 
selective review of research on culture and 
thinking, it also provides theoretical coher- 
ence for a diverse body of literature. 

I took the terminology of individual- 
ism and collectivism from anthropologists 
Florence Kluckhohn and Fred Strodtbeck’s 
pathbreaking 1961 book, Variations in Value 
Orientation. For me, collectivism was a world 
view in which people were more connected 
both to each other and to the physical world 
than in the individualistic worldview. The 
terminology was not perfect and continues 
to be problematic (e.g., Oyserman, Coon, & 
Kemmelmeier, 2002). The important point 
for me, however, was that a worldview 
and a value system had significant cognitive 
implications. 

The intrinsic connectedness of the phys- 
ical and social worlds for our unschooled 
Wolof participants was substantiated by the 
distinctive causal reasoning of unschooled 
children who had not yet attained conser- 
vation. Children who believed the quantity 
of water had changed after the experimenter 
transferred it to a taller, thinner beaker (or di- 
vided it into several smaller beakers) would 
often say that the amount had changed be- 
cause “you poured it.” This justification con- 
trasted with the more usual perceptual rea- 
sons I had seen in the United States — for 
example, the amount has changed because 
“the water is higher.” At first, I thought that 
“a natural phenomenon was being explained 
by attributing special, magical powers to 
intervening human agents” (Greenfield & 
Bruner, 1966/1969). But then we realized this 
was an ethnocentric interpretation. We drew 
upon Kohler (1937/1962), who points out 
that such phenomena are made possible by 
a worldview, 


in which animate and inanimate phenom- 
ena occupy a single plane of reality. That is, 
the child in the conservation experiment is 
faced with the following sequence of events: 
(1) water a certain way, (2) experimenter’s 
action, (3) water changed. When the child 
says the amount is not the same because 
the experimenter poured it, he is basing 
his causal inference on contiguity — the 
usual procedure even in our society. But un- 
der ordinary circumstances, we would ac- 
cept an explanation in terms of contiguous 
physical events or contiguous social events, 
but not a causal chain that included both 
kinds of event. Thus “magic” only exists 
from the perspective of a dualistic ontology. 

(Greenfield & Bruner, 1969, p. 639). 


The presence of a school in the bush vil- 
lage where I worked, Taiba N’Diaye, made 
possible a natural experiment. Some chil- 
dren went to school; others, even from the 
same families, did not. There was no selec- 
tion for school attendance on the basis of in- 
telligence. We therefore could see what dif- 
ference school made. Indeed, it suppressed 
the action reasons for inequality judgments 
with what we called at the time “astonish- 
ing absoluteness”; there was not one instance 
among all the school children, either in the 
village or in the capital city of Dakar (Green- 
field & Bruner, 1966/1969). This was my sec- 
ond hint that school functions to create an 
individualistic psychology. One route to this 
effect might be that, in school, one is always 
being asked to give reasons for things. At 
the time, however, my best candidate was 
literacy, introduced into the oral Wolof cul- 
ture by the school, of French colonial ori- 
gin. In the written word, a thought clearly 
has a separate physical manifestation from 
its referents in the real world; this could be 
the beginning of understanding self as sep- 
arate from world and thought as separate 
from its referent (Greenfield, 1972/1975). 
But the finding also shows that worldviews 
are not immutable; they are constructed 
by experience. 

Finally, a learning experiment helped us 
analyze further the thought processes of the 
unschooled children. We devised a proce- 
dure in which the child, rather than the ex- 
perimenter, first transferred the water from 
one beaker to a taller, thinner one, then to six 
tiny ones. We thought that the child might 
be willing to attribute powers to an author- 
ity figure that he was not willing to at- 
tribute to himself. Indeed, at all ages (from 
six to thirteen), conservation performance 
was much better when the child poured than 
when the experimenter poured, and there 
was good transfer of the conservation judg- 
ment to posttests in which the experimenter 
did the pouring (Greenfield, 1966). We con- 
cluded that the experimenter as authority 
figure was considered to have causal power 
to change the amount of water. Once the 
child had a chance to “do-it-himself or her- 
self,” the powers of the experimenter were 
somehow diminished. Only recently have 
I come to realize that the action reason 
social authority, is also part of the collectivis- 
tic worldview. 

We connected these patterns of thought 
to early Wolof socialization on the one hand 
and to African philosophy on the other. First 
we reasoned as follows: 


It may be that a collective, rather than in- 
dividual, value orientation develops where 
the individual lacks power over the physi- 
cal world. Lacking personal power, he has 
no notion of personal importance. In terms 
of his cognitive categories, now, he will be 
less likely to set himself apart from others 
and the physical world, he will be less self- 
conscious at the same time that he places 
less value on himself. Thus, mastery over 
the physical world and individualistic self- 
consciousness will appear together in a cul- 
ture, in contrast to a collective orientation 
and a... world view in which people's at- 
titudes and actions are not placed in sep- 
arate conceptual pigeonholes from physi- 
cal events. (Greenfield & Bruner, 1969, 
p. 640). 


Indeed, I had noted that the unschooled 
Wolof children had never spontaneously ma- 
nipulated the materials in the conserva- 
tion experiment. I saw this as indicative of 
the absence of a sense of power over the 
physical world. 


The Importance of Ethnography 


Was there a developmental reason in early 
socialization for this dichotomy between in- 
dividual mastery over the physical world and 
a collectivistic value orientation? I turned to 
the anthropological method of ethnography 
to find out. Ethnography is often defined in 
anthropology as participant observation; in 
the course of developing an appropriate par- 
ticipant role or roles in a real-life cultural 
setting, the researcher is able to record, tra- 
ditionally by means of in-depth field notes, 
everyday life and discourse relevant to a par- 
ticular topic or multiple topics. 

My colleague and friend in Senegal, 
Jacqueline Rabain, working on an ethno- 
graphic dissertation for the Sorbonne, found 
some clues to early socialization in the every- 


found clues, for example, in adult interpreta- 
tions of the child’s developing motor capac- 
ities. Whereas we, in the United States or 
France, would get excited about the child’s 
first step as an index of developing skill and 
even independence, a Wolof mother would 
likely interpret it as signifying the child’s de- 
sire in relation to a person in his surrounding; 
for example, she might say something like 
“Look, he’s walking toward you!” (Rabain- 
Zempléni, 1965). 


Thus, adult interpretation of the child’s first 
actions would seem to be paradigmatic for 
the choice between an individualistic and 
a collective orientation; a social interpre- 
tation of an act not only relates the actor 
to the group but also relates the group, in- 
cluding the actor, to physical events. When 
on the other hand, acts are given an inter- 
pretation in terms of motoric competence, 
other people are irrelevant and, moreover, 
the act is separated from the motivations, 
intentions, and desires of the actor himself. 

(Greenfield & Bruner, 1969, p. 641) 


Such selective interpretations serve an im- 
portant socializing function: They expose 
the child to what is considered important in 
a particular culture. 

Rabain also found the first clues that col- 
lectivism was associated with de-emphasis 
of the world of objects. She noted that ma- 
nipulation of objects was an occasional and 
secondary activity for the Wolof child from 
two to four years and that self-image rested 
more on power over people than power 
over objects. She noted further that verbal 
exchanges between adults and children of- 
ten concerned valued relations between peo- 
ple but rarely concerned explanations of the 
physical world (Rabain-Zempléni, 1965). 
Because scientific thinking is so linked to the 
world of objects, this was a clue that a col- 
lectivistic worldview might privilege social 
thinking, thinking about people and their 
relations, over scientific thinking. Later re- 
search has confirmed this paradigm of early 
socialization for a world that emphasizes 
thinking about people rather than things 
(Greenfield et al., 2003). It contrasts greatly 
with a paradigm that emphasizes learning to 
form of toys, from early infancy on (Green- 
field et al., 2003). 

Most intriguing, because it related di- 
rectly to my conservation experiment, was 
Rabain’s observation that, in the everyday 
situation of sharing a quantity among several 
persons (a situation not too different from 
the second half of my conservation experi- 
ment, in which a quantity of water was di- 
vided among six beakers), Wolof bush chil- 
dren pay more attention to who receives 
what, when, than to the amount received 
(Rabain-Zempléni, 1965). It parallels the 
“magical” action reason for nonconservation: 
More attention is focused on the person 
pouring, the social aspect of the situation, 
than on the purely physical aspect, the 
amount of water. This observation could 
also explain why Wolof children in Senegal 
achieved conservation in the standard ex- 
periment later than children in the United 
States or Switzerland. 

This work illustrates the way in which 
ethnography can complement experiments 
to deepen understanding of paradigms of 
cultural thought. Ethnography has a very 
special role to play because it introduces 
cultural interpretations of behavior — it re- 
veals that the very same behavior can have 
an opposite meaning in two different cul- 
tural settings. In a sense, when we do ex- 
periments in the United States, we already 
have done our ethnography. Because we are 
members of the society, we have a good idea 
of the cultural meaning of our results. This 
is not the case when we study a culture dif- 
ferent from our own. Ethnography also con- 
nects our findings in the laboratory to the 
real-world phenomena of everyday life. Fi- 
nally, because cultural values are implicit in 
the very design of our experiments, often 
without our realizing it, ethnography is re- 
quired to design valid cross-cultural experi- 
ments. We omit this first ethnographic stage 
of cross-cultural research at our own peril, 
as the reader will see later in this chapter. 


The Level of Social Ideology 


Rabain’s ethnography did not uncover only 
socialization antecedents to the thinking 


fascinating were parallels on the broader cul- 
tural level of social ideology. Aimé Césaire 
had developed a concept of négritude or 
blackness, a worldview that distinguished 
Black values from White. In opposition 
to the individualism of European cultures, 
négritude emphasizes “solidarity, born of the 
cohesion of the...clan” (Kestelhoof, 1962). 
The poet and president of Senegal, Leopold 
Senghor, defined négritude as “participation 
of the subject in the object, participation 
of the man in cosmic forces, communion 
of man with all other men” (Monteil, 1964, 
p. 31, my translation). This formulation of 
social and cultural ideology looked like my 
experimental results in Senegal writ large! 

It was therefore not surprising that 
cultural world view also permeated the 
second cognitive domain of my dissertation 
research in Senegal, the development of cat- 
egorization. If unschooled Wolof children 
were assuming that the world exists on one 
plane, with thought and object of thought 
as one unified reality, then it followed 
that the notions of individual viewpoints 
and different points of view would also 
be meaningless. Data from a study of 
picture categorization were relevant to this 
implication (Greenfield, Reich, & Olver, 
1966/1972). Children of different ages were 
given triads of pictures and asked to pick the 
pair that was most alike. After unschooled 
Wolof children had selected a pair, the 
pictures were replaced and the participants 
were asked to find two different pictures 
from the same set that were also alike. In 
fact, each set of three images had been 
designed to have three bases of similarity — 
form, function, and color. But unschooled 
Wolof children did not find a second basis 
of similarity; they saw the stimuli from only 
one point of view. Researchers working in 
other parts of the nonindustrial world found 
parallel results (Cole et al., 1971; Irwin & 
McLaughlin, 1970). Thus, categorization 
behavior also revealed indications of taking 
for granted a single perspective on the 
world. (See Goldstone & Son, Chap. 2, for a 
review of theories of similarity; and Medin 
& Rips, Chap. 3, for a review of studies of 
categorization.) 
Thought, Cole and Scribner noted the need 
for integrative theory “to pull together a va- 
riety of disconnected experiments” (Cole & 
Scribner, 1974, p. 172). I did not realize 
that the two paradigms of thought I had 
stumbled upon in Senegal formed the ba- 
sis of such an integrative theory. Data on 
culture and thought that could later be in- 
serted into this larger framework contin- 
ued to accumulate. Like my problem in 
developing questions that were meaningful 
to elicit reasoning in a conservation exper- 
iment, many of the findings were initially 
seen as methodological barriers to be over- 
come rather than as deep cultural differences 
in cognitive functioning. 

Let me give an example from Cole et al. 
(1971). These researchers took a categoriza- 
tion task to Liberia, where they presented 
it to their Kpelle participants. This task in- 
volved a set of 20 objects that divided evenly 
into the linguistic categories of foods, imple- 
ments, food containers, and clothing. When 
asked to group objects that were similar, the 
Kpelle participants did not do the taxonomic 
sorts expected by the researchers. Instead 
participants consistently made functional 
pairings (Glick, 1968). For example, rather 
than sorting objects into groups of tools and 
foods, participants would put a potato and 
a knife together because “you take the knife 
and cut the potato” (Cole et al., 1971, p. 79). 
According to Glick, participants often jus- 
tified their pairings by stating “that a wise 
man could only do such and such” (Glick, 
1968, p. 13). In total exasperation, the re- 
searchers “finally said, ‘How would a fool do 
it?’ The result was a set of nice linguistically 
ordered categories — four of them with five 
items each” (Glick, 1968, p. 13). 

From the methodological perspective of 
a cognitive psychologist, the researchers had 
failed to tap into the participants’ obvious 
competence in categorization with their first 
procedure. This example illustrates what 
Cole and Scribner (1974) viewed as two 
general problems in the cross-cultural study 
of thought: 


1. There is a great readiness to assume that 
particular kinds of tests or experimental sit- 


cognitive capacities or processes. 

2. Psychological processes are treated as “en- 
tities” which a person “has” or “does not 
have.” In other words, they are considered 
a property of the person rather than the sit- 
uation. 

(Cole & Scribner, 1974, p. 173). 


There is another problem in this story that 
also can be considered methodological — 
the ethnocentrism of the criteria for “cor- 
rect” sorting. Such methodological prob- 
lems led Cole and Scribner (1974) to rec- 
ommend that researchers take into account 
“knowledge about the culture and behav- 
ior of the people gained from the work of 
anthropologists, linguists, and other social 
scientists.” (Ref. 8, p. 196). They went a 
step “further in suggesting that the meth- 
ods of these relevant fields need to be in- 
tegrated. ... Field and laboratory, anthropo- 
logical observation and psychological ex- 
perimentation, can yield knowledge from 
different perspectives about the same func- 
tion” (Ref. 8, p. 196). We already have seen 
this advice in action; collection of both qual- 
itative and quantitative data is part of the 
methodological armoire of the cultural psy- 
chologist (Greenfield, 1997a). 

But the problems of “wise” and “fool- 
ish” sorting also get to the substantive heart 
of the collectivistic paradigm of cognition. 
From the vantage point of a collectivis- 
tic worldview, I would submit that the 
“wise man’s” pairings were of social util- 
ity, whereas the “foolish man’s” taxonomic 
groupings of five items each were socially or 
pragmatically useless. I believe that is why, 
for the Kpelle, a wise man would make func- 
tional pairs, whereas only a fool would make 
taxonomic sorts. 

This analysis leads us to an even deeper 
level of cultural definitions of intelligence: 
In the Kpelle example, the researchers’ 
criterion for intelligent behavior was the 
participants’ criterion for foolish; the partic- 
ipants’ criterion for wise behavior was the 
researchers’ criterion for stupid (Greenfield, 
1997b). Underlying these interpretations of 
the experiment are different ethnotheories, 
that is, folk theories of intelligence. Most 




profoundly, [...] 


thought is worth studying are very much 
influenced by our ethnotheories of what 
constitutes intelligent behavior. And what 
constitutes intelligent behavior depends on 
what is adaptive and valued in a particular 
ecocultural environment. The investigation 
of ethnotheories of intelligence proved to 
greatly deepen understanding of cultural 
paradigms of thought (see Sternberg, Chap. 
31, for further discussion of intelligence). 


Theories and Ethnotheories 
of Intelligence 


Clearly, human intelligence and the brain 
structure that supports it are keys to our 
adaptation as a species. Yet within this broad 
rubric of human intelligence, different forms 
of intelligence are valued and adaptive in 
different ecocultural niches. Mundy-Castle 
(1974/1976) contrasted technological intel- 
ligence, which is more developed in the 
independent, individualist characteristic of 
Europe, and social intelligence, which is 
more developed in the interdependent, col- 
lectivist characteristic of Africa. Closely re- 
lated to technological intelligence (and per- 
haps indistinguishable from it) is scientific 
intelligence. Indeed, underlying Piaget’s the- 
ory of cognitive development is a theory 
of intelligence as scientific thinking (Green- 
field, 1974). By his own admission, un- 
derstanding the basis for Western scientific 
thought was Piaget’s most fundamental the- 
oretical concern (Piaget, 1963/1977). Under 
Inhelder’s leadership, Piaget investigated the 
development of scientific thought (chem- 
istry and physics) in a set of experimental 
studies (Inhelder & Piaget, 1958). This body 
of theory and research implies the impor- 
tance of scientific intelligence as a develop- 
mental goal for processes of thinking. Sci- 
entific or technological intelligence as a folk 
theory supports thinking skills that relate to 
the world of things rather than people; this 
would include most of the items and subtests 
of standardized intelligence tests. 

Following Mundy-Castle’s depiction of 
technological and social intelligence, related 
explorations of intelligence concepts in dif- 
ferent cultures began to appear (Dasen & de 
Ribeaupierre, 1987; Serpell, 1994; Sternberg 
et al., 1981; Wober, 1974); all challenged the 
assumption that technological or scientific 
intelligence was a universal endpoint of de- 
velopment (Greenfield, 1974). Indeed, so- 
cial intelligence turned out to be the pre- 
dominant ideal in Africa and Asia (e.g., 
Wober, 1974; Super, 1983; Dasen, 1984; Gill 
& Keats, 1980; Serpell, 1994; Nsamenang & 
Lamb, 1994; Grigorenko et al., 2001). Intel- 
ligence in all these investigations includes a 
concern with responsible ways of contribut- 
ing to the social world. The central feature 
of the Baoulé concept of intelligence in Ivory 
Coast, West Africa, for example, is willing- 
ness to help others (Dasen, 1984). In gen- 
eral, African cultures not only emphasize 
social intelligence but also see the role of 
technical skills as a means to social ends 
(Dasen, 1984). This sort of ethnotheory of in- 
telligence could explain why the taxonomic 
sorter was a foolish man in Kpelle eyes. 

As a group, such conceptions can be seen 
as collectivistic conceptions of intelligence 
(Segall et al., 1999). Note that these con- 
ceptions are not all-or-none. Differences, to 
a great extent, are a matter of differen- 
tial priorities. At the same time, there is 
not one collectivistic conception of intelli- 
gence, nor a single individualistic concep- 
tion of intelligence. There are cross-cultural 
surface variations for each underlying theme 
(Greenfield, 2000). 


Who and What Are the Individualists 
and Collectivists? 


This is perhaps the place to stop and de- 
fine who are the individualists and who 
are the collectivists. In doing so, I will not 
present a simple picture. Instead, I will dis- 
cuss ideal cases, in-between cases, culture 
change, biculturalism, and culture contact. 
These complexities take me beyond simple 
binary distinctions that have bothered some 
(Rogoff, 2003). 

My nonbinary starting point is that all hu- 
man beings are both individual and social. 
try to maximize one or the other facet of 
the human experience. Correlated with this 
maximization are different forms that the 
social and the individual take within each 
paradigm. So, for example, social behavior 
tends to be more automatic in the collec- 
tivistic system and more by choice, providing 
individual autonomy, in the individualistic 
system. The other side of the maximization 
coin is the fact that the major mode of one 
cultural paradigm may be the minor mode 
of the other. For example, in the society of 
the United States, we might see religions as 
often emphasizing the communitarian in a 
primarily individualistic surround. The uni- 
versal existence of both modes can be seen 
in priming studies in which the minor mode 
(individualism in the case of Asians, collec- 
tivism in the case of North Americans) can 
be elicited by a relevant prime (Gardner, 
Gabriel, & Lee, 1999). 

It is also important to realize that we are 
talking about cultural systems, not isolated 
attributes (cf. Kitayama, 2002). The distri- 
bution of autonomy and obedience between 
men and women in a collectivistic culture 
has been used as an argument against the 
very concept of collectivistic culture and for 
the notion that autonomy and obedience are 
individual difference variables rather than 
culture-level characteristics (Turiel, 2000). 
In response to this argument, I note that 
one essence of a collectivistic culture is re- 
lations of obedience between women and 
men, clearly providing more autonomy for 
men than for women. Similarly, the rela- 
tion of equality among individuals provides 
more autonomy for both women and men 
in an individualistic culture. It is not the 
existence of autonomy that is important in 
the characterization of a culture according 
to the present paradigm; it is the pattern- 
ing that counts. Indeed, I would see the em- 
phasis on individuals as separate rather than 
as interrelated (the hallmark of psychology 
founded upon the independent individual 
as the unit of analysis) as an individualis- 
tic perspective on social science itself. Cul- 
ture as a system of relations, the patterning 
of attributes, the forms of individual and so- 
cial behavior, and the system of priorities — 


paradigm. 

Who are the collectivists? Harry Trian- 
dis notes that they include 70% of the 
world’s population — the populations of 
Africa, Asia, Latin America, and Native 
America (Triandis, 1989). Equally impor- 
tant, there are demographic, ecological, and 
historical factors that are inputs into the ex- 
pressed value system. Some of the most im- 
portant demographic factors are economic 
level [rich are more individualistic than poor 
(Segall et al., 1999)], the urban-rural con- 
trast [large-scale urban more individualistic 
than small-scale rural (Kim & Choi, 1994)], 
formal education [which functions as an in- 
dividualizer (Reykowski, 1994)], high tech- 
nology [which functions as an individu- 
alizer (Mundy-Castle, 1974)], immigration 
and migration (making people more indi- 
vidualistic), agricultural subsistence versus 
commerce [the latter functioning as an indi- 
vidualizer (Greenfield, Maynard, & Childs, 
2003; Greenfield, 2004)], and religion (some 
are more individualistic, e.g., Protestantism; 
others more collectivistic, e.g., Catholicism). 

Indeed, it is useful to see the two 
paradigms as originating as adaptations to 
different ecologies. Demographic factors in- 
fluence ecology and, through ecology, they 
form psychologies. Thus, rich people do not 
need to cooperate with a larger group for 
their survival; poor people do. The urban en- 
vironment contains many strangers, and so 
community relations become less functional 
(Kim, 1994). In formal education, the irre- 
ducible unit of performance is the individual 
who must receive an individual grade and 
performance evaluation (Greenfield, 1994). 
Complex technology functions as an individ- 
ualizer in multiple ways — through providing 
large dwellings and office buildings with the 
opportunity for private space and through 
substituting interaction with a machine for 
interaction with people (e.g., television re- 
placing frequent face-to-face visits). 

When you immigrate to a new country or 
migrate to a new location within a country, 
you often leave extended family behind. As 
a consequence, a high rate of geographical 
mobility should increase individualism. This 
might be a reason why Europeans are less 
that nation states composed primarily of im- 
migrants at their founding — for example, 
the United States, Canada, and Australia — 
are generally among the most individualistic 
(Hofstede, 1980; Oyserman, Coon, & Kem- 
melmeier, 2002). 

In subsistence agriculture settings, all 
must cooperate to produce mainly perish- 
able goods. In a commercial setting, it is de- 
sirable to maximize the monetary resources 
of an individual to accumulate nonperish- 
able consumer goods like cars or televi- 
sions (Collier, 2003). Catholicism empha- 
sizes the communal, including a pathway 
to God through another human being, the 
priest; Protestantism emphasizes the inde- 
pendent individual with a direct pathway to 
God. It is interesting that, as commerce de- 
velops in Mexico and Central America and 
when immigrants come to the commercial 
environment of the United States from the 
more agricultural environment of Mexico, 
evangelical Protestantism has become much 
more popular whereas Catholicism has de- 
clined in popularity. 

It is also important to note that, because 
of all these factors, individualism and col- 
lectivism are relative terms, their system- 
atic nature notwithstanding. If one tests 
rural versus urban populations in the same 
country (e.g., Mexico), one will usually find 
the rural population to be more collectivistic 
(e.g., Madsen & Shapira, 1970). On the other 
hand, if you compare Latino immigrant fam- 
ilies in Los Angeles, an urban setting, and 
Euro-American families in Los Angeles, the 
urban Latino families will respond more 
collectivistically than the Euro-Americans 
(Raeff, Greenfield, & Quiroz, 2000). In 
other words, the nature of these demo- 
graphic variables is such as to make individ- 
ualism and collectivism graded, rather than 
all-or-none systems. Because they are so cen- 
tral to adaptation, they are clearly very sen- 
sitive to environmental factors. 

Multiple demographic factors create 
paradigmatic cases on the extremes (H. 
Keller, personal communication, June, 
2003): The small, stable, poor, agrarian 
village with an oral culture and without
advanced technology would be the paradigmatic
case on the collectivistic end of the
spectrum. The large, mobile, rich, urban 
neighborhood with a high level of formal 
schooling and advanced technology would 
be the paradigmatic case on the individual- 
istic end of the spectrum. Clearly, all other 
cases would fall between these extremes. 

A particular type of in-between case is the 
immigrant family who has come, most gen- 
erally, from a poorer, more collectivistic soci- 
ety into a richer, more individualistic one. In 
general, such immigrants will be at a point 
between their compatriots in the ancestral 
country and natives of the host country on 
cognitive tasks that tap into individualis- 
tic and collectivistic paradigms of thought 
(Nisbett, 2003). In addition, we expect, as 
generations in the host country increase, the 
host country culture will make an increas- 
ingly large mark on patterns of thought. 

Because of the development of the world 
in the direction of a dense urban, com- 
mercial, high-tech environment, there is a 
worldwide movement toward increasing in- 
dividualism. Finally, because of high rates 
of immigration, there is also increasing con- 
tact between more individualistic and more 
collectivistic cultures in the world. This of- 
ten leads to mismatches and misunderstand- 
ings. I will give an example of a cognitive 
mismatch and misunderstanding later in the 
chapter. But let me now turn to some ad- 
ditional thought processes in which the two 
paradigms manifest themselves, yielding in- 
teresting cross-cultural differences. 


Thinking about People: Theory 
of Mind 


Given what I had observed in Senegal con- 
cerning the absence of a notion of point-of- 
view, I became very skeptical when theory of 
mind became popular in cognitive develop- 
ment research. The claims for universality 
of the sort of calculus that requires a par- 
ticipant to know, for example, what some- 
one knows a third party has said to a fourth 
party (e.g., “Does Mary know the ice-cream
man has talked to John?”, Baron-Cohen,
entiation of viewpoints for children whose
world view emphasized unity with the world 
and those around them. I wanted to think 
through the individualistic assumptions that 
might be being made in this line of re- 
search and to think about what a col- 
lectivistic alternative might look like. This 
search eventuated in one section of an article,
“Cultural Pathways through Universal
Development” (Greenfield et al., 2003),
which I present here. 

Understanding self and others is part 
of our universal evolutionary heritage 
(Tomasello, 1999; Whiten, 2002). The mir- 
ror neuron system of the cerebral cortex 
reveals a common neuromuscular activa- 
tion for acting oneself and for understand- 
ing the actions of others (Fadiga et al., 
1995; Iacoboni et al., 1999). In ontogeny, 
the first step in understanding self and oth- 
ers occurs at birth, when infants discrimi- 
nate people from things (Trevarthen, 1980). 
Comprehension of agency as the produc- 
tion of goal-directed action begins in early 
infancy (Gelman & Lucariello, 2002). An 
ability to distinguish between self and oth- 
ers as intentional agents develops at eight or 
nine months of age (Piaget, 1952; Tomasello, 
1999; Trevarthen, 1980). 

At the one-word stage of language devel- 
opment (between one and two years of age), 
infants code the intentional action not just of 
self but of others (Greenfield & Smith, 1976; 
Greenfield, 1980), and this encoding seems 
to have ancient phylogenetic roots (Green- 
field & Savage-Rumbaugh, 1990; Greenfield 
& Lyn, in press). The linguistic encoding of 
intentional action becomes more complex 
with age and the acquisition of language 
(Bloom, Lightbown, & Hood, 1975). At the 
same time, there is very early understand- 
ing of the effects of action on other people. 
Script knowledge, which begins in the sec- 
ond year of life, involves the understanding 
of both intentions and effects of human ac- 
tion (Gelman & Lucariello, 2002). It also re- 
quires an understanding of the coordination 
of action by more than one person. 

These two universal capacities — the capacity
to encode the intentions of self and
others and the capacity to understand the social
effects of one’s own and others’ action —
provide the groundwork for two distinct 
cultural emphases in the development of 
person knowledge. Some cultures empha- 
size the individual psyche, individual traits, 
and the individual intentions behind action 
(Vinden & Astington, 2000); other cultures 
emphasize the social effects and social con- 
text of a person’s action (Duranti, 1988, 
1993; Shweder & Bourne, 1984; Fiske et al., 
1998). The latter also see mind and heart 
as integrated rather than separate (Lillard, 
1998; Zambrano, 1999). We see the former
as the individualistic emphasis and the latter 
as the collectivistic or sociocentric emphasis. 

Most literature on theory of mind — the 
ability to think about other people’s mental 
states — has assumed an emphasis on indi- 
vidual minds (Flavell, 1999). I, however, see 
theory of mind as a special culturally canalized
case of person knowledge (cf. Hobson,
1993). I therefore review the literature in- 
dicating the existence of these two differ- 
ent cultural emphases — individual psyche 
versus social effects or context — in the de- 
velopment of social understanding or person 
knowledge. 

Although the classical literature on theory
of mind claims universality, I use it to
complete the picture of the individualistic
pathway to person knowledge. Early steps
along this pathway have to do with the ac- 
quisition of mentalistic terms; children as 
young as twenty-two months first produce 
mentalistic terms such as know and pretend 
(Wellman, 1990). Later, the child is able to 
imagine a mental state of affairs in another 
person different from the information avail- 
able to oneself (e.g., Perner, 1991). Similar 
trends occur in literate, developed countries, 
both Western and non-Western (Wellman, 
Cross, & Watson, 2001). The differentia- 
tion and individuation of people according 
to their states of mind is basic to this devel- 
opmental pathway to social understanding. 

In the other pathway, however, mental- 
istic terms are lacking in the lexicon, are 
not understood in the same way as the En- 
glish equivalents, and are not applied to one- 
self. This phenomenon has been found in a 
& Bruner, 1966/1969; Vinden, 1996, 1999).
As mentioned earlier, however, both school- 
ing, with its demand for justifications, and 
literacy, with its separation of thought (on 
paper) from thinker, lead to an understanding
of the mentalistic term think (Greenfield
& Bruner, 1966/1969). (See Lillard, 1998,
for a cross-cultural review of the theory-of-mind
literature.)

In a nonliterate subsistence ecology in 
Africa, children between two and four years 
old were given a theory-of-mind task em- 
bedded into a context of social action. In ad- 
dition, the task used the term heart rather 
than thought (Avis & Harris, 1991). Un- 
der these circumstances, Baka children in 
southeast Cameroon showed the develop- 
ment of social understanding that had been 
found in the United States and Europe. 
The results contrasted strongly with another 
study that (1) decontextualized the task, 
presenting it as a task involving only one 
actual person, the subject; (2) asked about 
the deceived’s thought rather than action 
in reference to a hidden object; and (3) 
asked about mind rather than heart. Under 
these conditions, Quechua children between 
about four and eight performed at chance 
levels (Vinden, 1996). Somewhat more 
contextualized tasks led to somewhat im- 
proved performance in subsistence groups 
in Cameroon, West Africa (the Mofu), and 
Papua New Guinea (the Tainae and Tolai) 
(Vinden, 1999). 

Meta-analysis indicates that, around the 
world, children from subsistence cultures 
solve theory-of-mind tasks better when 
these are presented in context (Wellman, 
Cross, & Watson, 2001). However, Vinden 
(1999) found a lag in age in all groups relative 
to children of European-derived cultures; 
false belief (the understanding that another 
person has been misled into believing that 
something is true that, in fact, is false) as- 
sessed using the word “think” was at chance 
levels at all ages in the two groups most iso- 
lated from the outside world of European 
culture. 

Here we interpret a lag as indicating 
that the skill in question is not valued in 


a collectivist or group orientation, personal, 
mental, and emotional states are relatively 
unimportant” (Vinden & Astington, 2000, 
p. 512). In line with the notion that school 
ecology favors the development of atten- 
tion to the individual psyche, schooled chil- 
dren performed better on several of the 
tasks relating to predicting an individual’s 
behavior or emotion in a nonsocial situation 
(Vinden, 1999). 

On the other hand, in a culturally impor- 
tant situation involving social responsibil- 
ity, young children from small, face-to-face 
societies with subsistence traditions show 
advanced understanding of the knowledge 
state and feelings of another person whose 
knowledge differs from one’s own. In a suc- 
cessful apprenticeship situation, the expert 
must be aware of how much less the novice 
knows in comparison with self. The expert 
must also be aware of the novice’s need for 
materials and the novice’s motivations. In a 
video study of naturalistic teaching interac- 
tions, Zinacantec Maya children as young as 
four years old were able to supply necessary 
materials and model tasks for their younger 
siblings (Maynard, 2002). They were also 
able to provide useful verbal guidance in 
teaching, such as narrating a task they were 
demonstrating and giving commands to the 
younger child. By the age of eight, chil- 
dren were very adept at simplifying the 
task for younger children by giving them 
parts of tasks, one at a time, and at scaf- 
folding the task by providing complex ver- 
bal information. These advanced thinking 
skills showed an understanding of the knowl- 
edge state, material needs, and motivation of 
the younger children. Sibling caregiving as 
an important social responsibility may have 
played a role in the young children’s de- 
sire and skill in teaching their younger sib- 
lings. Similar sibling teaching practices were 
found in another sibling-caregiving culture — 
the Wolof of Senegal (Rabain-Jamin, May- 
nard, & Greenfield, 2003). Future research is 
needed to explore the relationship between 
the cognitive operations of person knowl- 
edge in sibling caregiving and in experimen- 
tal tasks. 
that person knowledge has been measured
so frequently by false belief, the dominant 
theory-of-mind task. In a false-belief task, 
the participant must understand that an- 
other person has a different perspective (the 
false belief) from his or her own. It is a 
task that requires individuation of one’s per- 
spective from that of another. Individuation 
is an important component of the devel- 
opment of the independent self. It may be 
that socialization in interdependent cultures 
emphasizes shared perspectives more than 
different perspectives. Only future research 
can tell us whether this may be another 
reason for relatively poor performance on 
false-belief tasks in collectivistic, subsistence 
cultures. 

Ideally, cross-cultural comparison would 
involve a developmental analysis of tasks 
tapping into both of these cultural emphases 
within the context of universal develop- 
ments. A pioneering study of social explana- 
tion in India and the United States by Joan 
Miller (1984) did exactly that: Children in 
both the United States and India improved 
at social explanation with age (the universal 
development). At the same time, children 
in the United States increasingly formulated 
their social explanations of events in terms 
of an individual’s stable traits (emphasis on 
the individual psyche). Indian children, in 
contrast, increasingly formulated their social 
explanations in terms of contextual factors, 
particularly factors in the social surround 
(emphasis on social context). 

Miller’s findings were replicated in a real-world
situation by Morris and Peng (1994).
They found that when a Chinese physics 
student at the University of Iowa shot his 
advisor and several other people after los- 
ing an award competition, the reasons given 
were quite different in U.S. and Chinese 
newspapers: 


Michael Morris, a graduate student at 
Michigan at the time, noticed that the ex- 
planations for Gang Lu’s behavior in the 
campus newspapers focused almost entirely
on Lu’s presumed qualities — the mur- 
derer’s psychological foibles (“very bad tem- 
per,” “sinister edge to his character”), atti- 


important means to redress grievances”), 
and psychological problems (“a darkly dis- 
turbed man who drove himself to success 
and destruction,” “a psychological problem 
with being challenged”). He asked his fel- 
low student Kaiping Peng what kinds of ac- 
counts of the murder were being given in 
Chinese newspapers. They could scarcely 
have been more different. Chinese reporters 
emphasized causes that had to do with the 
context in which Lu operated. Explana- 
tions centered on Lu’s relationships (“did 
not get along with his advisor,” “rivalry 
with slain student,” “isolation from Chi- 
nese community”), pressures in Chinese so- 
ciety (“victim of Chinese ‘Top Student’ edu- 
cational policy”) and aspects of the Amer- 
ican context (“availability of guns in the 
U.S.”). (Morris & Peng, pp. 111-112).


Morris and Peng found the same pattern of 
differences when the incident involved a stu- 
dent from the United States. The Chinese 
focused on the killer’s relation to context, 
particularly social context, in explaining his 
behavior. U.S. reporters focused on quali- 
ties of the individual. A whole series of ex- 
periments on causal attribution led to the 
conclusion that “Westerners attend primar- 
ily to the focal... person and Asians attend 
more broadly to the field and to the relations 
between the object and the field” (Nisbett, 
2003, p. 127). Thus, a pattern of cultural 
differences found in the developing child by
Miller also shows up in adulthood, the endpoint
or outcome of development.

Hong Kong is a setting in which two cul- 
tures, one more collectivistic (Chinese) and 
one more individualistic (British), coexist.
Hong et al. (2000) showed the dynamism 
of the bicultural mind in the arena of social 
explanation. When primed with symbols of 
Western culture (e.g., Mickey Mouse) in 
an experiment concerning social explanation 
(participants had to explain why, in a pic- 
ture, one fish was swimming in front of the 
other fish), participants constructed more 
explanations in terms of individual motiva- 
tion. When primed with symbols of Chinese 
culture (e.g., with a dragon), participants 
constructed more explanations in terms of 
the other fish or the context. 
people can affect a sense of one’s own continuity
of self over time. Parallel to the
two modes of social explanation discov- 
ered by Miller, researchers Lalonde, Chan- 
dler, and Sokol (1999) identify two cultural 
modes of addressing the problem of self- 
continuity over time in autobiographical nar- 
ratives. This is the problem of how to expe- 
rience and conceptualize a continuing self 
in the presence of dramatic changes over 
the course of development. They term the 
first model “an ‘Essentialist’ or ‘Entity’ notion
of selfhood” (Lalonde et al., 1999, p. 1); these
narratives focus attention upon some aspect
of the self “that is thought to remain untouched
by time and change” (Lalonde et al.,
1999, p. 1). The pathway of the independent,
autonomous self requires a source
of self-continuity that is functional in the 
face of separation from parents, the modal 
adolescent identity formation in the United 
States and Canada. Internal essences or en- 
tities would fill this requirement; this is 
the way in which most non-Native Canadi- 
ans explain self-continuity (Chandler et al., 
2003). And, as we have seen from Miller’s 
research, internal traits or essences are gen- 
erally used in causal attribution in the indi- 
vidualistic paradigm. 

They call the second model a 
“relationship-centered” notion of self. 
It uses narrative to connect the self across 
different time periods. The narratives often 
situate the speakers in family and com- 
munity relationships that continue across 
various periods in the life cycle. This is 
the way most Native Canadians explain
self-continuity (Lalonde, Chandler, & Sokol,
1999).


Thinking about Things: 
Categories, Physical Relations, 
and Social Relations 


A more collectivistic ethnotheory of intel- 
ligence that values relationships and social 
utility can explain why the wise Kpelle per- 
son would make functional pairs in a cat- 


nomic categories. Taxonomic categories, in 
contrast, revolve around a defining trait or 
traits of their members. These defining traits
are decontextualized from the social util- 
ity of the object or from other parts of the 
physical world. We saw this same contrast 
between an emphasis on inner traits that 
transcend context and contextualized expla- 
nation when we examined two paradigms of 
social reasoning (Miller, 1984). 

If the Kpelle mode of categorization typi- 
fies a collectivistic worldview, then it should 
appear in other collectivistic cultures. In- 
deed, this is the case. Ji, Zhang, and Nis- 
bett (2002) compared U.S. college students 
with students from China and Taiwan on a 
triadic test of categorization. In each triad 
(e.g., panda, monkey, banana), there were 
two pictures that could be paired on the 
basis of taxonomic similarity (in this triad, 
panda and monkey are both animals), and 
there were two that could be paired on the 
basis of functional relationships (in this triad, 
the monkey eats the banana). When asked 
which two of the three pictures were most 
closely related, U.S. college students preferred
to group “on the basis of common category
membership: Panda and monkey fit into
the animal category.” The Chinese participants
showed a preference for grouping
on the basis of thematic relationships (e.g., 
monkey and banana) and justified their an- 
swers in terms of relationships: “Monkeys 
eat bananas” (Ji, Zhang, & Nisbett, 2002, 
pp. 140-141). This same cross-cultural difference
developed in childhood (Chiu, 1972).
But, again, cultural preferences do not nec- 
essarily exclude the development of a minor 
mode. Illustrating this point, a study by Wis- 
niewski and Bassok (1999) indicates that, 
in the absence of a forced choice between 
taxonomic similarity and functional relationships,
U.S. college students can and do
use both modes of thought as an implicit 
basis for similarity judgments and other cog- 
nitive operations. 

Perhaps the most basic difference be- 
tween the two modes of thinking is the 
collectivistic tendency to contextualize the 
world of objects in a web of social relations 
world of physical objects as operating in its
own plane of reality. The former is what we 
saw in the causal reasoning among the un- 
schooled Wolof children; the latter is what 
we expect in the world of physical science. 
These two modes of thinking about things 
are socialized very early (Bakeman et al., 
1990; Clancy, 1986; Fernald & Morikawa, 
1993; Rabain, 1979; Rabain-Jamin, 1994; 
Zempléni-Rabain, 1973). 


Cross-Cultural Conflict in What 
Counts as Thinking 


When families with a collectivistic cultural 
heritage immigrate to an individualistic society,
the two paradigms can come into
sharp conflict, particularly at school. Cul- 
tural models not only have values attached 
to them — what counts as good and bad, what 
takes priority over what — but they also have 
epistemologies — what counts as knowledge. 
These cultural models are so basic they normally
remain implicit. As long as everyone
interacting in the same social world shares 
the same model, the implicit quality of the 
models does not cause a problem. In fact, it 
provides an underlying set of shared assump- 
tions that makes social life — for example, life 
in school — run smoothly. The next example 
is about what happens in a bicultural class- 
room when teachers and learners have differ- 
ent implicit understandings of what counts 
as thinking. 


In a pre-kindergarten class, the teacher 
held an actual chicken egg. She asked 
the children to describe eggs by think- 
ing about the times they had cooked and 
eaten eggs. One of the children tried three 
times to talk about how she cooked eggs 
with her grandmother, but the teacher 
disregarded these comments in favor of 
a child who explained how eggs are 
white and yellow when they are cracked. 

(Greenfield, Raeff, & Quiroz, 1996). 


The two features of this incident — the first 
child’s emphasis on a family-based story and 
the teacher’s disregard and devaluation of 
the child’s seemingly unscientific answer — 


grant Latino students. But what is really hap- 
pening here? 

Our theoretical analysis rests on the fol- 
lowing two points: What counts as thinking 
for the teacher is thinking about the phys- 
ical world apart from the social world. It is 
the teacher’s definition of scientific thinking, 
and, in her mind, this is a science lesson. 
Her focus is on one part of her instructions,
“Describe eggs.” The child, in contrast,
is responding more to the other part
of the teacher’s instructions, “Think about
the times you have cooked and eaten eggs,”
and, based on a different set of assumptions
about what counts as thinking, focuses
on the social aspect of her experience with 
eggs, in particular, a family experience. This 
is the first aspect of the misunderstanding 
and cultural mismatch between teacher and 
learner. 

The second aspect of the mismatch is that 
the child who was passed over is providing 
a narrative, also valued in her home culture, 
whereas the teacher is expecting a simple 
statement of fact. Implicitly, the teacher is 
making Bruner’s distinction between narra- 
tive thought and logical-scientific thought. 
Bruner’s analysis is very relevant here: 


There appear to be two broad ways in 
which human beings organize and manage 
their knowledge of the world, indeed struc- 
ture even their immediate experience: One 
seems more specialized for treating of phys- 
ical “things,” the other for treating people 
and their plights. These are conventionally 
known as logical-scientific thinking and 
narrative thinking. (Bruner, 1996, p. 39). 


The child who talks about cooking and eat- 
ing eggs with grandmother is responding in 
the narrative mode; but the teacher expects 
the logical-scientific mode: “What are the 
bare facts about eggs?” she wants to know. 
Narrative is, in the dominant culture, associated
with the humanities; logical-scientific
thought is associated with the sciences. As
Bruner says, the value of logical-scientific 
thinking “is so implicit in our highly tech- 
nological culture that its inclusion in school 
curricula is taken for granted” (Bruner, 1996, 
egg incident shows, the narrative mode becomes
invisible to the teacher.


Logic 


The same type of contrast applies to logi- 
cal thought (see Evans, Chap. 8). Deductive 
logic is intrinsically decontextualized from 
its content (Nisbett et al., 2001; Nisbett, 
2003). We would therefore expect it to
be part of individualistic but not collectivistic
habits of thought. Instead, a collectivist
might recontextualize a deductive
problem. This phenomenon was first iden- 
tified by Luria in the 1930s with unedu- 
cated Soviet peasants in Central Asia (Luria, 
1971). Inspired by Luria, Cole et al. (1971) 
gave such problems to nonliterate Kpelle 
adults in a rural area of Liberia. Here is 
an example of a deductive logic problem 
and how the participant refuses to deal 
with the decontextualized structure and, 
instead, recontextualizes it, first by ask- 
ing more questions concerning context and 
then by applying his own experience to 
the problem: 


EXPERIMENTER: At one time spider went to a
feast. He was told to answer this question
before he could eat any of the food. The
question is: Spider and black deer always
eat together. Spider is eating. Is black deer
eating?
SUBJECT: Were they in the bush?
EXPERIMENTER: Yes.
SUBJECT: They were eating together?
EXPERIMENTER: Spider and black deer always
eat together. Spider is eating. Is black deer
eating?
SUBJECT: But I was not there. How can I
answer such a question?
EXPERIMENTER: Can’t you answer it? Even if
you were not there you can answer it.
SUBJECT: Ask the question again for me to
hear.
EXPERIMENTER: (repeats the question)
SUBJECT: Oh, oh black deer was eating.
EXPERIMENTER: Black deer was eating?
SUBJECT: Yes.
EXPERIMENTER: Black deer was eating?
SUBJECT: Yes.
EXPERIMENTER: What is your reason for saying
that black deer was eating?


ways walks about all day eating green 
leaves in the bush. When it rests for a while 
it gets up again and goes to eat. 

(Cole et al., 1971, p. 187).


In essence, this participant rejects the ab- 
stract, decontextualized structure of the log- 
ical problem. This type of response was 
typical of a group of nonliterate Kpelle 
adults. In line with our notion of school 
as promoting an individualistic worldview, 
Kpelle high school students generally an- 
swered the logical problems in the way the 
researchers had in mind — as decontextual- 
ized logical deductive problems. 

Again, if this distinction is typical of the 
two paradigms of thought, it should apply 
to other groups who might differ on the 
individualism-collectivism worldview. Using
different methods, Nisbett and his colleagues
showed that East Asians, like the
Kpelle, rejected decontextualized abstract 
logic and preferred to reason on the basis 
of experience (Nisbett et al., 2001). 


Visual Pattern Construction: A Case 
of Historical Change 


The worldwide direction of change on 
all critical demographic variables — to- 
ward greater population density, formal ed- 
ucation, technology, and commerce-based 
wealth — yields an historical push toward the 
pole of individualism. I will use the domain 
of visual representation to provide an exam- 
ple of how historical change can move cog- 
nition in the direction of the individualis- 
tic paradigm of thought. One of the marks 
of a collectivistic cultural system is respect 
for elders and their traditions. The individ- 
ualistic side of this coin places a value on 
novelty and innovation. The typical econ- 
omy in which respect for elders predomi- 
nates is agricultural subsistence. Innovation, 
in turn, is an important value in commercial 
entrepreneurship. An experiment demon- 
strated how a shift from one economy to 
another affected the representation of cul- 
turally novel patterns. 
sentation experiment in a Zinacantec Maya
community of Chiapas, Mexico (Green- 
field & Childs, 1977) that involved, among 
other things, continuing both culturally 
novel and culturally familiar (from traditional
weaving) striped patterns. The experimenter
would place sticks of different
colors in a rectangular wooden frame,
providing three repetitions of the pattern 
(for example, green, green, green, yellow 
would be a single repetition of one of the 
patterns). She would then ask the sub- 
ject to continue the same pattern. At that 
time, the dominant economy was agricul- 
tural subsistence with relatively little cash 
or commerce. 

I returned to the community in 1991 af- 
ter a period of economic development in 
which commercial entrepreneurship and a 
cash economy had grown greatly with a 
corresponding decline in agricultural subsis- 
tence. I predicted that skill in continuing 
novel (not familiar) patterns would have in- 
creased, and this is exactly what I found. 
Even more interesting, I was able to relate 
this skill with novel representations directly 
to participation in commerce. Change had 
been uneven, and children whose families 
were most involved in commercial activities 
in both their business dealings and as con- 
sumers showed the most skill in constructing 
the novel patterns. Structural equation mod- 
eling indicated a causal relationship between 
correct completion of the novel patterns and 
commercial involvement. 

At the same time in this community, 
where weaving was the most important skill 
learned by all girls, there had been a shift 
in woven patterns from tradition to novelty. 
At the earlier period, there was a closed set 
of about four patterns that girls and women 
wove for clothing and other utilitarian pur- 
poses. By the time we went back in 1991, 
the basic patterns still existed, but they had 
been supplemented by an ongoing process 
of innovation through girls and women who 
created an infinite number of woven and em- 
broidered designs. So skill in representing 
culturally novel patterns in our experiment 
was a reflection of change in the culture as a 


ture to money and commerce. 

In terms of the socialization processes 
that could develop these new cognitive 
styles, we found an historical change in 
weaving apprenticeship that also had moved 
toward a more individualistic model. In 
commercial families, weaving apprentice- 
ship had, between 1970 and the early 1990s, 
moved from help and guidance from the 
teacher to a more independent trial-and- 
error learning process for the novice weaver. 
Moreover, we also found a correlation between
the more independent, individualistic
mode of weaving apprenticeship
and skill in continuing the novel patterns in our
experiment.

So these basic cultural paradigms of 
thinking are not constant. They are adapta- 
tions to social conditions, including social- 
ization processes, that change over time. As 
the world becomes more commercial, more 
dense, and more formally educated, the Zin- 
acantecs illustrate this worldwide trend from 
a more collectivistic to a more individualistic
paradigm of thought. 


CONCLUSIONS AND FUTURE DIRECTIONS 


Identifying two basically different paradigms 
of thought, value, and behavior has linked 
together phenomena in the domain of cul- 
ture and thinking that were once consid- 
ered unrelated. With this linking thread has 
come deeper understanding of basic cultural
differences. Besides providing theoretical
coherence, it has also removed some
of the ethnocentrism from earlier accounts 
of difference, in which, for example, collec- 
tivistic forms of categorization, reasoning, 
and logic were considered the absence of 
Western skills rather than as examples of a 
different set of values about the nature of 
intelligence. 

The primary omission in the preced- 
ing account is probably the ecocultural ap- 
proach to everyday cognition and particu- 
larly the role of cultural artifacts in thinking. 
For good reviews from these perspec- 
tives, I recommend Everyday Cognition by 
Schliemann, Carraher, and Ceci (1997) and 
Culturally Situated Cognition by Wang, Ceci, 
body of work generated by these approaches 
is not at all antithetical to the theoretical 
paradigm presented here. In the future, I 
believe further theoretical integration will 
take place. 
CHAPTER 28 


Legal Reasoning 


Phoebe C. Ellsworth 


For more than a century, lawyers have writ- 
ten about legal reasoning, and the flow of 
books and articles describing, analyzing, and 
reformulating the topic continues unabated. 
The volume and persistence of this “unre- 
lenting discussion” (Simon, 1998, p. 4) sug- 
gests that there is no solid consensus about 
what legal reasoning is. Legal scholars have 
a tenacious intuition — or at least a strong 
hope — that legal reasoning is distinctive, 
that it is not the same as logic, or scientific 
reasoning, or ordinary decision making, and 
there have been dozens of attempts to de- 
scribe what it is that sets it apart from these 
other forms of thinking. These attempts gen- 
erate criticism, the critics devise new formu- 
lations that generate further criticism, and 
the process continues. In this chapter, I de- 
scribe the primary forms of legal reason- 
ing, the most important schools of thought 
about legal reasoning, and some of the ma- 
jor differences between legal reasoning and 
scientific reasoning. 

The first question is, “Whose legal reason- 
ing are we talking about?” Jurors are given 
instructions on the law at the end of every 
trial and are asked to apply that law to the 


evidence they've heard to reach a verdict. 
They are asked to engage in “legal reason- 
ing.” Clients approach their attorneys with 
rambling stories and a strong, if somewhat 
vague, sense of injustice, and it is the at- 
torney’s job to figure out the laws, prece- 
dents, and facts that most favor the client 
and to integrate them into a persuasive case. 
This task involves legal reasoning, but the 
reasoning is driven by the desired outcome. 
The goal is not to reach the right decision 
but to make the best argument for one side. 
The evidence, as orchestrated by the lawyers, 
and the legal arguments they make form 
the raw materials for the judge’s decision, al- 
though judges (like juries) may also draw on 
their own background knowledge and expe- 
rience and their own interpretations of the 
evidence and (unlike juries) their own un- 
derstanding of the law. 

When scholars write about “legal reason- 
ing,” they are writing about judges. The 
lawyer does not have to decide the case, 
but only to make the strongest appeal for 
one side; lawyers’ reasoning is discussed in 
courses and writings on advocacy. Jurors in- 
terpret the evidence to decide what actually 
in the judge’s instructions to reach a verdict. 
The judge must also seek out the appropri- 
ate legal authority, deciding which laws and 
previous cases are applicable. Jurors are not 
supposed to reason about the law itself; that 
is the task of the judge. Judges are trained in 
the law, they know the statutes and prece- 
dents, and they have the experience of judg- 
ing many cases and reading the decisions of 
other judges. Jurors do not provide reasons 
for their verdicts; judges often do. Finally, 
much of what is written about legal rea- 
soning is about appellate court decisions, in 
which judges are primarily concerned with 
legal procedure and the law itself, not with 
who wins and loses, and in which they al- 
most always must provide legal explanations 
for their decisions. 

In the subsequent historical section, I de- 
scribe how basic visions of the nature of le- 
gal reasoning have changed over time. Most 
judges, if they thought about their thought 
processes at all, have probably accepted the 
commonsense background theory prevalent 
in the legal culture of their era. Some, how- 
ever, including some of the greatest judges, 
have recognized that they really can’t ex- 
plain how they reach decisions (Holmes, 
1897; and cf. Nisbett & Wilson, 1977). In 
1921, Benjamin Cardozo began his classic 
work, The Nature of the Judicial Process, with 
the observation that “[A]ny judge, one might 
suppose, would find it easy to describe the 
process which he had followed a thousand 
times and more. Nothing could be farther 
from the truth” (1921, p. 9). 

But that does not mean there are no com- 
monly accepted characteristics of legal rea- 
soning. There are. The problem that vexes 
legal scholars is that these characteristics are incomplete. 
Although they undoubtedly influence judi- 
cial reasoning, they are insufficient either 
to predict future outcomes or to provide 
a fully satisfactory account for past ones. 
The two most common reasoning strate- 
gies, taught in every law school course on 
legal reasoning and writing, are the deduc- 
tive method (rule-based reasoning) and the 
analogical method (case-based reasoning). 
These strategies are not unique to legal rea- 


lation to scientific reasoning as well. What is 
distinctive about these forms of reasoning in 
the legal context is not so much the process 
but the context, the raw materials to which 
the processes are applied, and the nature of 
the rules. 


Deductive and Analogical Reasoning 
in Law 


Deductive (Rule-Based) Reasoning 


In deductive scientific reasoning (see Dun- 
bar & Fugelsang, Chap. 29), there is a gen- 
eral law or theory, and the scientist uses that 
theory to infer what will happen in some 
particular fact situation, makes a prediction, 
and designs an experiment to test it. If the 
prediction is not confirmed, there are three 
possibilities: The deduction was flawed, 
the experiment was flawed, or the theory 
is flawed. In deductive legal reasoning, the 
decision maker begins with a specific set of 
facts, looks at the law that applies to those 
facts, and reaches a verdict. If Joe’s Liquor 
Store sells beer to 16-year-old Richard, and 
there is a law prohibiting the sale of alco- 
hol to anyone under the age of 21, then Joe’s 
Liquor Store is guilty. The reasoning is ba- 
sically syllogistic, and in many cases the ap- 
plication of the law is unproblematic (see 
Evans, Chap. 8). These are called easy cases. 
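
The syllogistic structure of an easy case can be sketched in a few lines of Python. This is only an illustration: the names Sale, violates_statute, and MINIMUM_PURCHASE_AGE are hypothetical, not taken from any statute or legal software, and the sketch assumes that the facts are undisputed and that exactly one unambiguous rule applies, which is what makes the case easy.

```python
from dataclasses import dataclass

# Major premise (the rule): alcohol may not be sold to anyone under 21.
# The threshold comes from the example in the text; the name is hypothetical.
MINIMUM_PURCHASE_AGE = 21


@dataclass(frozen=True)
class Sale:
    """Minor premise (the facts), assumed here to be undisputed."""
    seller: str
    buyer: str
    buyer_age: int
    product_is_alcohol: bool


def violates_statute(sale: Sale) -> bool:
    """Conclusion: the rule covers the facts when both conditions hold."""
    return sale.product_is_alcohol and sale.buyer_age < MINIMUM_PURCHASE_AGE


easy_case = Sale(seller="Joe's Liquor Store", buyer="Richard",
                 buyer_age=16, product_is_alcohol=True)
print(violates_statute(easy_case))  # True
```

Each input to this function is exactly what becomes contestable in harder cases: which version of the facts is accepted, which rule governs, and how an open-ended term in the rule is to be read.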

In practice, there are many ways in 
which ambiguity can creep into this appar- 
ently clear logical process. First, the decision 
maker is faced with a specific set of facts. If 
he or she is a judge, there are almost always 
two versions of the facts. It is the attorneys’ 
job to organize the facts in a way that fits the 
legal outcome they wish to achieve, and they 
do this by emphasizing different facts and, 
often, different legal precedents. “[T]he law 
determines which facts are relevant while at 
the same time, the facts determine which 
law is relevant” (Burton, 1995, p. 141). There 
may be more than one law that is poten- 
tially applicable. There may be several statu- 
tory provisions that might be relevant, and 
the two opposing counsel may argue that a 
trol this case. The statute itself may violate a 
higher rule, such as the state or federal con- 
stitution. The rule may be ambiguous, as in 
a ban on “excessive noise,” or the application 
of the “reasonable person” standard (“Would 
a reasonable person have believed that her 
life was in danger?”). 

In preparing a case, an attorney will go 
back and forth between developing a co- 
herent version of the facts that fits the law 
and conducting legal research to find out 
which laws frame the facts in the best pos- 
sible way. The judge, faced with two com- 
peting arguments, may choose one of them, 
or may bring in additional factual interpreta- 
tions or legal considerations not mentioned 
by either of the parties. Thus, even the ap- 
parently simplest form of legal reasoning — 
deciding whether the law covers the specific 
fact situation — is often quite complicated in 
practice. The commonsense idea that there is 
a behavior, there is a law, and the ques- 
tion is “Does the behavior conform to the 
law?” is much too simple to apply to interes- 
ting cases. 


Analogical (Case-Based) Reasoning 


In the Anglo-American common law 
tradition, cases are decided by examining 
the patterns of decisions in earlier, related 
cases. No case has meaning in isolation, 
and general rules and propositions are 
useless without “the heaping up of concrete 
instances” (Llewellyn, 1930, p. 2), except in 
very simple cases. A somewhat similar form 
of reasoning occurs in science when a scien- 
tist examines a series of studies with appar- 
ently inconsistent results and tries to come 
up with a general principle that will explain 
the inconsistencies. In research on social 
facilitation, for example, some researchers 
found that people performed better on a task 
when other people were around, but other 
researchers found that people performed 
better when they were alone. In 1965, 
Robert Zajonc resolved this controversy by 
showing that the emotional arousal caused 
by the presence of others enhanced perfor- 
mance on well-learned tasks but impaired 


He applied a more general principle that 
explained the apparently contradictory re- 
sults of past research and made sense of the 
field. He then went on to devise a situation 
in which the new principle could be tested. 

The judge begins where the scientist ends, 
with a specific situation in which the out- 
come must be decided — not predicted and 
tested but decided by examining the sim- 
ilarities and differences between this new 
case and the previous cases and choosing an 
outcome that corresponds to the holdings 
of the cases it most resembles. In the ad- 
versarial system, the lawyers emphasize the 
prior cases that were decided the way they 
want this one to be decided, finding crucial 
differences in the prior cases that went the 
“wrong way” so as to argue that their hold- 
ings are inapplicable in the present context. 
The lawyers have a certain leeway in their se- 
lection of which facts to emphasize, in their 
interpretation of the facts, and in their de- 
scription of the legal significance of those 
facts (Llewellyn, 1930, p. 70). Like the scien- 
tist, the lawyer may identify some principle 
that explains why the current case should 
be considered an example of the first group 
rather than the second. The judge examines 
the strengths and weaknesses of the argu- 
ments of the two parties and either chooses 
between them or develops a different princi- 
ple for placing the present case in the context 
of the past ones. 

When legal educators claim that the basic 
mission of the first year of law school is to 
train the student to “think like a lawyer,” it 
is this sort of analogical reasoning they gen- 
erally have in mind — the ability to spot the 
factual and legal similarities and (more im- 
portant) differences between the case un- 
der study and related previous cases and 
to recognize which similarities and differences 
are relevant (e.g., the defendant’s state 
of mind) and which are not (e.g., the defendant’s 
name). This entails defining the 
universe of possibly applicable cases and de- 
ciding which ones match the current case 
most closely and which, although apparently 
similar, do not apply. The focus is on the 
particular cases, and the reasoning is more 
tion of a general principle (Sunstein, 1996, 
p. 67; see Holyoak, Chap. 6, for further dis- 
cussion of analogical reasoning). 
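
A toy sketch in Python may make the role of relevance concrete. It is an assumption-laden illustration, not a model of how judges actually reason: facts are reduced to labels, similarity is a simple Jaccard overlap, and every name in it (Precedent, similarity, closest_precedent, the fact labels, and both hypothetical precedents) is invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precedent:
    name: str
    facts: frozenset
    holding: str


def similarity(case_facts: frozenset, precedent: Precedent, relevant: frozenset) -> float:
    """Jaccard overlap, counting only facts treated as legally relevant."""
    ours = case_facts & relevant
    theirs = precedent.facts & relevant
    union = ours | theirs
    return len(ours & theirs) / len(union) if union else 0.0


def closest_precedent(case_facts, precedents, relevant):
    """Return the precedent whose relevant facts best match the new case."""
    return max(precedents, key=lambda p: similarity(case_facts, p, relevant))


# Two hypothetical precedents that went opposite ways.
precedents = [
    Precedent("Case A",
              frozenset({"intent_to_harm", "defendant_warned", "defendant_named_smith"}),
              "for plaintiff"),
    Precedent("Case B",
              frozenset({"plaintiff_trespassing", "defendant_named_jones"}),
              "for defendant"),
]
new_case = frozenset({"intent_to_harm", "defendant_named_jones"})

# Treat only substantive facts as relevant, then treat every fact as relevant.
legally_relevant = frozenset({"intent_to_harm", "defendant_warned", "plaintiff_trespassing"})
everything = new_case.union(*(p.facts for p in precedents))

print(closest_precedent(new_case, precedents, legally_relevant).holding)  # for plaintiff
print(closest_precedent(new_case, precedents, everything).holding)        # for defendant
```

The same new case matches a different precedent once the defendant's name is allowed to count, which is why learning to separate relevant from irrelevant similarities matters more than the matching itself.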

Finally, as with deductive reasoning, the 
significance of a particular fact depends on 
its legal significance, and the significance 
of a particular law or previous holding de- 
pends on the exact fact pattern of the 
case. The legal reasoner must consider both 
simultaneously. 


Theories of Legal Reasoning 


Formalism 


That “legal reasoning” is considered to be a 
distinctive form of reasoning worthy of being 
included as a separate topic in the Cambridge 
Handbook of Thinking and Reasoning 
is attributable in large measure to Christo- 
pher Columbus Langdell, who became the 
first Dean of the Harvard Law School in 
1870, and who revolutionized legal educa- 
tion. He introduced the case-based tech- 
nique of teaching law; he created the image 
of the law faculty as a group of perma- 
nent scholars devoted to legal research, 
explicitly promoting the analogy to the fac- 
ulty of a science department; and he advo- 
cated a view of legal reasoning known as “le- 
gal formalism.” 

The essence of legal formalism is the idea 
that “a few basic top-level categories and 
principles formed a conceptually ordered 
system above a large number of bottom-level 
rules. The rules themselves were, ideally, the 
holdings of established precedents, which 
upon analysis could be seen to be discovered 
from the principles” (Grey, 1983, p. 11). In 
other words, there is a pyramid of rules with 
a very few fundamental “first principles” at 
the top, from which mid-level and finally a 
large number of specific rules could be de- 
rived. The legal decision maker, faced with 
a case to be decided, would study the body 
of law and discover the rule that determined 
the correct result. 

In 1870, science represented the pin- 
nacle of human intellectual achievement, 


discipline rather than a mere trade, Langdell 
embraced the idea that law is a science 
(Langdell, 1880). He did not originate this 
view, which can be found in Blackstone’s 
Commentaries and earlier (Kennedy, 1973), 
but he promulgated it enthusiastically. An 
obvious problem with this analogy is that 
in law there is no means of experimenta- 
tion, no access to previously unknown data. 
The “data” consisted of the writings of ear- 
lier judges: “We have constantly inculcated 
the idea that the library is the proper work- 
shop of professors and students alike; that it 
is to us all that the laboratories of the uni- 
versity are to the chemists and physicists, the 
museum of natural history to the zoologists, 
and the botanical gardens to the botanists” 
(Langdell, 1887, p. 124; emphasis added). 
The data were what judges had said, and new 
data were what new judges said, based on 
their readings of their predecessors. Langdell 
did not argue that law as it existed actually 
achieved the beautiful hierarchical organi- 
zation from clear, highly abstract principles 
down to lower levels that would finally allow 
precise derivations that would fit any new set 
of particular facts; creating such an arrange- 
ment was a goal of legal science. 

Of course this view of science as a closed 
deductive system strikes most modern sci- 
entists as unrealistic and simplistic — a view 
of science that we were taught in eighth 
grade but that rarely seems like a descrip- 
tion of what we actually do or how we 
actually think. The behavioral sciences espe- 
cially (and it seems natural to us that if law 
is to be considered a science at all it should 
be a behavioral science) seem a poor fit for 
such an abstract deductive model of reasoning. 
Even in 1870, the excitement of observation, 
empiricism, and induction was rapidly 
replacing earlier deductive views of science. 

Langdell’s model of science was more like 
the taxonomic system of Linnaeus than like 
empirical science. Families of plants and an- 
imals were organized under phyla (the fun- 
damental principles), genera under families, 
and species under genera. During the explo- 
rations of the eighteenth and nineteenth cen- 
turies, an astonishing variety of new plant 
one could be compared with others at the 
species level and classified appropriately in 
its place in the ruling structure. In the same 
way, each new legal case could be examined 
for its similarities and differences to previ- 
ously decided cases, which in turn had been 
classified according to the general taxon- 
omy, and so could be decided accurately. In 
law, “the fundamental principles of common 
law were discerned by induction from cases, 
rules of law were then derived from princi- 
ples conceptually, and, finally, cases were de- 
cided, also conceptually, from rules” (Grey, 
1983, p. 19). 

There were critics of legal formalism from 
the very beginning. The alternative view is il- 
lustrated in two famous remarks by Oliver 
Wendell Holmes, Jr.: “The life of the law 
has not been logic: It has been experience” 
(Holmes, 1881, p. 1), and “general princi- 
ples do not decide concrete cases” (dissent- 
ing opinion in Lochner v. New York, 1905, p. 
76). Holmes and, later, critics such as Pound, 
Llewellyn, and Cardozo argued that legal 
principles were not “discovered” by careful 
research into the rules and principles, and 
that such research, however diligent, would 
not yield definite and incontrovertible an- 
swers in any but the easiest cases. Instead of 
clear distinctions between the cases decided 
in one way and those decided in the other 
(for the plaintiff or the defendant in a med- 
ical malpractice case, for example), there is 
overlap and fuzziness at the boundary and, 
in the end, the judge creates the defining dis- 
tinction rather than discovering it (Cardozo, 
1921, p. 167). The distinctions were often 
arbitrary, not logical, and influenced by the 
judge’s own sense of what the right outcome 
should be. The fundamental principles and 
legal rules were important and provided con- 
siderable guidance to the judge but, in most 
cases, they were insufficient to determine 
the outcome. The certainty and sense of in- 
evitability expressed in judicial opinions was 
quite unjustified. As time goes by and the 
legal landscape becomes dense with more 
and more intermediate cases, the failures of 
formalism become increasingly apparent. As 
Holmes put it 


distinction which is a clear one when stated 
broadly. But as new cases cluster around 
the opposite poles, and begin to approach 
each other, the distinction becomes more dif- 
ficult to trace; the determinations are made 
one way or the other on a very slight pre- 
ponderance of feeling, rather than articu- 
late reason; and at last a mathematical 
line is arrived at by the contact of contrary 
decisions, which is so far arbitrary that it 
might equally well have been drawn a little 
further the one side or the other (Holmes, 
1873, p. 652). 


Although the idealistic theory behind 
formalism has largely been abandoned (cf. 
Kennedy, 1973; Gordon, 1984; Grey, 1983; 
Simon, 1998), its categories and its ana- 
lytic methods persist. Its classifications are 
still robust — substantive versus procedural 
law; contracts, torts, property. They deter- 
mine how the first year of law school is 
structured. No comprehensive new organi- 
zational scheme has replaced the categories 
of formalism, and they therefore continue to 
“influence judgment much as the agenda for 
a meeting influences the results of its delib- 
erations” (Grey, 1983, p. 50). 

The tenets of legal formalism still ex- 
ercise a strong influence on the way judi- 
cial opinions are written. Decisions typically 
are presented as the inevitable consequence 
of a careful analysis of the facts and 
the applicable law based on the classifi- 
cation of this case in relation to previous 
cases. The correct decision and the govern- 
ing principles are described as discovered, 
not created, by the judge (Schauer, 1995, 
p. 642, note 23), and are expressed with 
great certainty, as though there were no 
room for doubt. “It seems that this neo- 
formalist form of jurisprudence — typified 
by a self-reported experience of constraint, 
high confidence and singular correctness, 
all couched in the rhetoric of closure — is 
the predominant, albeit unofficial, mode of 
judicial reasoning in current American legal 
culture” (Simon, 1998, p. 11). In part, this 
persistence is attributable to the strong be- 
lief that the law requires stability. For peo- 
ple to have faith in the legal system, judges’ 
to make predictable, logical decisions there 
must be a fixed framework from which 
those decisions are derived. A major differ- 
ence between law and science, as discussed 
subsequently, is that uncertainty and change 
are a sign of a healthy scientific climate; 
they would definitely not signal a healthy 
legal climate. 


Legal Realism 


Legal realism arose in opposition to formal- 
ism and can be seen as an extension and elab- 
oration of Holmes’s early skepticism. Legal 
realists rejected the formalist ideas that the 
law was a self-contained logical system pro- 
viding for the scientific, deductive derivation 
of the right answer in all new cases. They 
regarded this view as a vain daydream disconnected 
from the real-world influences on 
legal decision makers — hence the label “legal 
realism.” 

In a strict formalist analysis, two different 
judges should always judge the same case 
in the same way unless one of them was 
mistaken in his understanding of the facts 
or the law. Clearly this was not the case. 
In the nineteenth century, as now, courts 
were often divided. There were judges in 
the majority and there were dissenters, and 
no one seriously argued that the dissenters 
were incompetent or in need of retraining. 
Of course the formalists did not believe this 
was the way the world really worked, but 
they did believe that the legal system could 
approximate that ideal and that it was an 
ideal worth striving for. The legal realists be- 
lieved that it was an impossible ideal and that 
it was a waste of time to strive for it. 

According to the legal realists, instead of 
reflecting an abstract set of nearly immutable 
principles, the law reflects historical, social, 
cultural, political, economic, and psycholog- 
ical forces, and the behavior of individual 
legal decision makers is a product of these 
forces. It therefore is not surprising that dif- 
ferent judges, with different goals and back- 
grounds, should decide cases differently, and 
contrary decisions do not imply that some 
judges must be “wrong.” 


“Sociological Jurisprudence,” which was ex- 
pounded most explicitly by Roscoe Pound 
(1912). Like Holmes, Pound felt that the 
“mechanical jurisprudence” of the formal- 
ists was out of touch with social real- 
ity and that legal scholarship and judicial 
norms were standing still, out of touch with 
exciting developments in philosophy and, 
particularly, the social sciences. “Jurispru- 
dence,” he argued, “is the last in the march 
of sciences away from the method of de- 
duction from predetermined conceptions” 
(Pound, 1909, p. 464). The strict doctrinal 
approach blinded legal writers to two essen- 
tial considerations: first, the purposes of the 
law — the goal of doing justice rather than 
following the letter of the law; and second, 
the social, cultural, and psychological factors 
that influenced behavior, including the be- 
havior of lawmakers and judges. Blind adher- 
ence to the abstract law-on-the-books might 
make for greater certainty and predictability, 
but “reasonable and just solutions of individ- 
ual cases” were “too often sacrificed” (Pound, 
1912, p. 515). The law treated all individuals 
as equivalent regardless of their social back- 
ground or position. Thus, for example, the 
right of an employee to quit was legally the 
same as the right of the employer to fire him. 
Both were free agents enjoying the “liberty of 
contract.” But of course the employer could 
easily find another employee, whereas the employee 
would have lost his livelihood and 
might have a very hard time finding another 
job. The law’s refusal to acknowledge these 
obvious social truths was a major stimulus to 
sociological jurisprudence. 

Pound argued that legal scholarship and 
judicial decisions should “take more ac- 
count, and more intelligent account, of the 
social facts upon which law must proceed 
and to which it is to be applied” (1912, 
p. 513). The focus should not be on the ab- 
stract content of the laws but on how they 
actually work. It is important to consider 
the purpose of laws and to modify them if 
these purposes are not being achieved. And 
judges should regard the law as suggestive 
rather than determinative of their decisions: 
If strict application of the law would result 
in an outcome that is unjust or contrary to 
cause of justice is appropriate. 

The basic views of Holmes and Pound 
were quite similar — pragmatic and open- 
minded. Pound, however, was a far stronger 
proponent of an interdisciplinary solution 
to the problems of formalism. The social 
sciences were very much on the rise at 
the beginning of the twentieth century and 
seemed “progressive” in a way that law was 
not. Their ideas stretched the imaginations 
of the more intellectually curious law pro- 
fessors and challenged some of the most 
fundamental assumptions of the law. The so- 
ciologists (the most influential group) sug- 
gested that the equality of all assumed by 
the law (e.g., the “liberty of contract”) was a 
myth because status and power significantly 
affected a person’s choices, the anthropolo- 
gists revealed a wide range of peaceful so- 
cieties with entirely different kinds of legal 
systems, and psychologists raised questions 
about the essential legal concepts of free will 
and responsibility, suggesting that behavior 
was determined by psychological and social 
factors beyond the control of the individual 
(Green, 1995). 

The period identified as the flowering 
of legal realism was the period between 
the wars (Fisher, Horwitz, & Reed, 1993). 
Holmes and Pound were the inspirational 
figures from the past, but now there were 
enough like-minded scholars so they could 
legitimately be called a “school” or a “move- 
ment,” although never an organization. Like 
the cognitive psychologists who shook off 
the shackles of behaviorism in the 1960s and 
1970s, they were an eclectic group united 
mainly by their opposition to the old ways. 
Some tried to do empirical research, some 
were political activists (and some eventually 
became part of the New Deal government), 
some continued as legal scholars but preach- 
ing a new faith, and some were articulate 
gadflies. Some were and are highly respected 
figures in the history of legal scholarship, 
some were but are no longer, and some were 
always seen as fringe elements. 

As with their predecessors, their primary 
unifying theme was a rejection of the old 
ways and a passionate belief that legal doc- 
trine played a limited role in legal decision 


be. Karl Llewellyn, one of the most impor- 
tant figures in the group, argued that law was 
about “disputes to be settled and disputes to 
be prevented” (1930, p. 2), not about rules; 
about what legal decision makers do, not 
what they say. Legal rules were regarded as, 
at best, post hoc justifications and, at worst, 
criteria that could lead judges to unjust de- 
cisions. Advocates in a trial could usually 
describe the facts and the law so as to pro- 
duce coherent, complete, persuasive argu- 
ments for two diametrically opposite con- 
clusions. Llewellyn even wrote an article on 
statutory interpretation showing that each 
of 28 basic legal propositions could be ar- 
gued either way: “A statute cannot go beyond 
its text”/“To effect a purpose a statute may 
be implemented beyond its text”; “Where 
design has been distinctly stated no place 
is left for construction”/“Courts have the 
power to inquire into real — as distinct from 
ostensible — purposes” (Llewellyn, 1950, 
pp. 401, 403). 

The agenda of the legal realists was both 
descriptive and prescriptive. According to 
Felix Cohen, “Fundamentally, there are only 
two significant questions in the field of law. 
One is, ‘How do courts actually decide cases 
of a given kind?’ The other is, ‘How ought 
they to decide cases of a given kind?’” (1935, 
p. 824). The answer to the descriptive ques- 
tion was that courts do not decide cases on 
the basis of laws because the law always 
allows for multiple answers. In considering 
what sort of forces do influence case out- 
comes, different scholars emphasized social 
and cultural forces (Cohen, 1935; Lasswell, 
1930; Yntema, 1928), unconscious psycho- 
logical drives (Frank, 1930), or just a pro- 
cess of intuition that eventually culminated 
in a Gestalt-like “Aha effect” after long ru- 
mination (Hutcheson, 1929). These influ- 
ences affect the assessment of the actual 
facts of the case — the credibility of the 
witnesses, the plausibility of the stories, as 
well as the judge’s “sense of how the law 
ought to respond to these facts” (Fisher, Horwitz, 
& Reed, 1993, p. 165). Legal realists 
were ridiculed as believing that judicial 
decisions depended on what the judge ate 
for breakfast. However, the realists generally 
idiosyncratic or unpredictable. “Law is not 
a mass of unrelated decisions nor a prod- 
uct of judicial bellyaches. Judges are hu- 
man, but they are a particular breed of 
humans, selected to a type and held to ser- 
vice under a potent system of governmen- 
tal controls” (Cohen, 1935, p. 843). Because 
most judges come from the same social class, 
receive the same legal education, and are 
subject to the same social and historical in- 
fluences and the same role demands, their 
decisions will resemble each other. 

The intellectual enterprise of legal schol- 
arship, therefore, should be to describe the 
actual behavior of courts, taking account of 
the broader social context. The realists were 
confident that this behavior would not be 
predictable from written legal doctrine or 
statutes. Instead, the legal rules and con- 
cepts would turn out to be consequences, 
rather than causes, of judges’ behavior. To 
understand how judges reach their decisions, 
it is important to analyze their social back- 
grounds, previous experience, and role de- 
mands and the general political, social, and 
economic pressures of the times. Because 
these same forces affected the behavior of 
the parties to the case, the relation between 
the judge’s position in society and that of the 
litigants should also be explored. This gen- 
eral set of ideas was easy to demonstrate in 
particular cases. Then, as now, the opinions 
of individual judges on particular issues were 
often easy to predict. Defense lawyers “shop” 
for judges known to be sympathetic to of- 
fenders who resemble their client (judges 
who believe that drug laws are too harsh, for 
example). On some issues, it is easy to pre- 
dict Supreme Court Justices’ positions based 
on their previous opinions and their general 
ideology. Coming up with a more general 
mid-level theory, something between vague 
abstract statements about “social forces” and 
predictions of what a particular judge would 
say in a particular case, was a much greater 
challenge and one the realists never actually 
accomplished. 

The description of what courts actually 
do was supposed to explore not only the 
causes of judicial decisions but also their 
consequences. A study of consequences is es- 


ought [courts] to decide cases of a particular 
kind?” Judicial decisions affect human be- 
havior, often favoring one group’s interests 
over another, and they affect future judicial 
decisions. Careful study of these conse- 
quences would allow for better-informed ju- 
dicial decisions and better laws. 

Prescriptively, the realists argued first that 
in applying the law, judges ought to con- 
sider the purpose of the law and, second, 
that they should focus on the particulars 
of the case and compare it with the partic- 
ulars of preceding cases, rather than look- 
ing for broad general principles. Consid- 
eration of the purposes of the law was 
supposed to enhance the fairness and the 
consistency of decisions, and blind applica- 
tion of the rule without considering its pur- 
pose would lead to bad decisions (Llewellyn, 
1942). To facilitate this approach, legisla- 
tors and judges should make the reasons 
for the law explicit, to provide appropri- 
ate guidance to future judges: “Only the 
rule which shows its reason on its face has 
ground to claim maximum chance of contin- 
uing effectiveness” (Llewellyn, 1942, p. 260). 
Because social conditions were constantly 
changing, however, judges should be free 
to revise and reject even rules with clearly 
stated purposes; the development of law, 
like the development of science, should be 
a never-ending process of examination and 
re-examination. 

Specific comparison of the particular 
case to be decided with the facts of related 
cases, through analogical reasoning, was the 
preferred method. Just as a case read by it- 
self is meaningless (Llewellyn, 1930, p. 49), 
a case read with reference to the law but 
without reference to other cases is also 
meaningless. Close factual comparisons will 
reveal the empirically grounded rules and 
cultural beliefs that actually explain legal 
decisions because “legal rules are simply for- 
mulae describing uniformities of judicial de- 
cision” (Cohen, 1935, p. 848). Some of the 
realists believed that close examination of 
the prior body of cases required more than a 
reading of the cases alone. Some felt that an 
education in social science was necessary to 
fully understand the social forces influencing 
legal researchers should create databases on 
the background of judges and their decisions, 
the frequency with which laws on the books 
were actually enforced, whether they are en- 
forced against some groups more than oth- 
ers, whether patterns of enforcement have 
changed over time (e.g., obscenity laws), and 
so on. 

The legal realists have been identified 
with a “social science” point of view, but this 
meant different things to different scholars. 
Most of them probably shared Pound’s be- 
lief that, although other scientific disciplines 
were making huge progress, law was stag- 
nating, backward-looking, and clinging to a 
static, deductive model that had been aban- 
doned by other sciences. The law, because it 
deals with ever-changing values, opportuni- 
ties, and norms of behavior, should keep pace 
with these changes. Most also were some- 
what shaken by the ways in which sociology 
and psychology were undermining the no- 
tion of free will central to the law (Green, 
1995). Most of them agreed that the focus of 
attention should be on how judges think, not 
on the written rules. They were fairly unified 
in describing what was wrong with formalism 
but never fully agreed on the remedies and, 
indeed, proposed very few. 

Beyond this general sense that the law 
should develop as society develops and 
take general account of progress in the so- 
cial sciences, the realists followed different 
paths. Some more or less stopped there. 
For others, the “critical realists” in Horwitz’s 
(1992) terminology, social science mainly 
meant a concern with social policy. Politi- 
cally they were progressives, and flourished 
under the New Deal. Cardozo, Brandeis, 
Frankfurter, and Douglas followed Holmes 
to the Supreme Court, and several others 
moved to important positions in the New 
Deal administration. For them, the social sci- 
ence that mattered was the sociologists’ em- 
phasis on social class and a generally socialist 
view of what should guide the government 
and the courts. For them, as for many of the 
social scientists of the time, social science 
meant social activism. 

Another group, the “constructive real- 
ists” (Horwitz, 1992), believed that legal 


information about the causes and conse- 
quences of various rules, conducting in- 
terdisciplinary empirical research, and that 
courts should consider social science data 
in deciding cases. The method of mar- 
shaling social scientific evidence in argu- 
ing a case was pioneered by Louis Brandeis 
and Josephine Goldmark in the famous 
“Brandeis brief” in Muller v. Oregon (208 U.S. 
412). In arguing that it was constitution- 
ally permissible to restrict women’s work- 
ing hours to ten hours a day, they presented 
hundreds of excerpts from various articles 
and reports claiming that long working hours 
were damaging to women’s health. Most of 
these were not actually scientific reports, but 
they were an effort (successful) to force the 
court to consider the social facts involved 
in the legal question and the social conse- 
quences of the decision. The “Brandeis brief” 
is legendary, and the inclusion of social sci- 
ence research in legal arguments is now com- 
mon. Modern trial and appellate courts rou- 
tinely consider social science data, although 
their actual influence is probably less than 
most social scientists would like to believe 
(Ellsworth & Getman, 1986). 

There were some efforts to compile 
databases (Pound and Frankfurter, 1922; and 
cf. Schlegel, 1980) and a few attempts to ac- 
tually carry out systematic research projects. 
However, these attempts generally failed to 
achieve the grand purposes their authors had 
in mind. In writing a traditional law review 
article, the author typically knows what the 
conclusion is at the beginning; empirical re- 
search, as any honest scientist knows, often 
forces agonizing rethinking and sometimes 
produces data so ambiguous that nothing 
can be concluded. So, in 1928, the future 
Supreme Court Justice William O. Douglas 
conducted a study of business failures de- 
signed to produce revolutionary insights but 
ended up with two small, inconclusive arti- 
cles (Fisher, Horwitz, & Reed, 1993, p. 233). 
Underhill Moore, a Yale law professor in one 
of the three experimental law and social sci- 
ence interdisciplinary programs, attempted 
a behaviorist (Hullian) analysis of the ef- 
fects of parking tickets (Moore and Callahan, 
1943) that provoked intense ridicule even 
“the nadir of idiocy” (1956, p. 400)]. Em- 
pirical research by legal scholars has slowly 
increased over the past 50 or 60 years, but 
at the time, the admonishments of the le- 
gal realists only produced a brief spate of at- 
tempts, nothing like a major change in orien- 
tation. It is still the case that some law pro- 
fessors regard empirical research as mindless 
and mechanical, with data a crutch for those 
whose mental capacities are insufficient to 
reach the truth on their own. 

Although the excesses of Legal Realism 
are still parodied in well-worn clichés (such 
as the “what the judge had for breakfast” 
cliché), in the main, it has been absorbed 
into American legal thought; thus, only the 
excesses stand out as distinctive. Close com- 
parison of cases is the standard method of 
legal education, and consideration of the 
social context, purposes, and policy impli- 
cations of the law is common. The chal- 
lenge posed by the realists — the relative 
role of law versus social and personal con- 
siderations — still looms over the study of 
law and defines the questions. Databases 
are everywhere, especially in the criminal 
justice system, but also in the civil arena. 
The American Bar Association regularly pro- 
poses guidelines based on statistical data as 
do government commissions. No one still 
believes in strict Langdellian formalism, al- 
though many law courses are an uneasy 
blend of formalism and the considerations 
raised by the legal realists, and judicial opin- 
ions are written in formalist language. And 
the later developments of legal realism, al- 
though never quite mainstream, are thriv- 
ing. In 1935, Felix Cohen wrote that “It is 
reasonable to expect that some day even 
the impudencies of Holmes and Llewellyn 
will appear sage and respectable” (1935, 
p. 847), and that prophecy has certainly 
come true. 


Critical Legal Studies, Law and 
Economics, and the Law and 
Society Movement 


Although many of the ideas of the legal 
realists have been incorporated into the 
mainstream of law, there are three direct de- 


rents. One, called Critical Legal Studies, is 
a reincarnation of the Progressive political 
themes of Legal Realism, and the other two 
(the Law and Economics movement and the 
Law and Society movement) are develop- 
ments of the interdisciplinary social science 
endeavor. 

Law and Economics scholars are fairly 
traditional in terms of economic theory 
[e.g., Tversky, Kahneman, and the behav- 
ioral economists so far have had mini- 
mal influence (Kahneman & Tversky, 2000; 
Kahneman, Slovic, & Tversky, 1982; Thaler, 
1992)], taking as given the assumption that 
people rationally assess their circumstances 
and do what will maximize their own wel- 
fare. The potential criminal calculates the 
probabilities of getting caught, being pun- 
ished, and the potential severity of pun- 
ishment and weighs these considerations 
against the beneficial consequences of the 
crime (money, the extermination of a goal- 
blocking person) and accordingly decides 
whether or not to commit the crime. They 
attempt to fit legal decisions into a stan- 
dard economic framework and, if they do 
not fit, to argue that they should. Although 
they are often described as descendants of 
the legal realists, in some ways the Law 
and Economics movement bears a closer re- 
semblance to the formalists. It has a for- 
mal model with a set of first principles: “Be- 
havior always takes the form of constrained 
maximization. The actor chooses from some 
specified set of options, selecting the option 
that maximizes some objective function. In 
orthodox theory, consumers have preferences 
that are represented by a utility function, 
and they choose in a way that maximizes 
their utility...” (Kreps, 1990, p. 4, cited in 
Hanson & Yosifon, 2003). Explanations and 
recommendations follow deductively from 
the basic premises. Law and Economics has 
little to say about what is distinctive about 
legal reasoning; it is primarily another ex- 
ample of the economic model of reasoning 
in general. 
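
The constrained-maximization premise quoted above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a model taken from the Law and Economics literature: the `Option` fields, the numbers, and the simplification that the gain is kept even when punishment follows are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One course of action open to the actor (all values hypothetical)."""
    name: str
    gain: float        # benefit of the act, in arbitrary utility units
    p_caught: float    # probability of getting caught
    p_punished: float  # probability of punishment, given being caught
    penalty: float     # severity of punishment, in the same units as gain


def expected_utility(option: Option) -> float:
    # Assumed objective function: gain minus the expected penalty.
    p_penalty = option.p_caught * option.p_punished
    return option.gain - p_penalty * option.penalty


def choose(options: list[Option]) -> Option:
    # "Selecting the option that maximizes some objective function."
    return max(options, key=expected_utility)


options = [
    Option("abstain", gain=0.0, p_caught=0.0, p_punished=0.0, penalty=0.0),
    Option("commit crime", gain=1000.0, p_caught=0.3, p_punished=0.5,
           penalty=5000.0),
]

for option in options:
    print(option.name, expected_utility(option))
print("chosen:", choose(options).name)
```

Under these made-up numbers the expected penalty is smaller than the gain, so the rational actor of the model commits the crime; doubling the penalty reverses the choice. Recommendations of this kind follow deductively from the premises, which is the sense in which the approach resembles formalism.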

By contrast, the Law and Society schol- 
ars are open-minded, eclectic, and devoid 
of any theoretical mission. Instead, they are 
committed to the social science method of 
and social context matter. Friedman (1986) 
has proposed that Law and Society is a field 
like “Area Studies” in which scholars from 
many disciplines study law the way scholars 
from many disciplines study Latin America 
or Southeast Asia. Their concern with con- 
text and actual behavior means that they 
are relatively uninterested in “purely intel- 
lectual forces — the role of legal thinkers, for- 
mal doctrine, philosophy and theory of law; 
the role of abstract ideas” (Friedman, 1986) 
because such forces are mainly epiphenom- 
ena, not fundamentally causal. A great deal 
of important and interesting work has come 
from this school, but it is not really about 
legal reasoning in general. In fact Law and 
Society scholars would reject the idea that 
there is such a thing as legal reasoning in 
general. 

Critical Legal Studies is the bad boy of 
the bunch, and in this regard it is more ob- 
viously connected to the Legal Realists in 
their role as iconoclastic rebels. Like the re- 
alists, they argue that interpretation of the 
law is subjective, and they emphasize the 
role of power and political ideology more 
strongly than most of the realists. Like the re- 
alists, they have been more effective as crit- 
ics than as authors of an alternative vision 
(Kennedy, 1997), and some of them have 
glorified “trashing” as a sufficient contribu- 
tion (Tushnet, 1984). In some ways, they 
resemble the postmodernists of other disci- 
plines, insisting that there is “no there there,” 
that all legal concepts, like all other social 
concepts, are socially constructed (except of 
course for power and dominance). 

However, some of their analyses of le- 
gal reasoning went beyond what the legal 
realists had produced. In arguing that 
judges’ decisions were based on per- 
sonal and social values, not law, the legal 
realists didn’t quite get at the process by 
which a judge’s preference is turned into a 
legal justification. Is the judge’s reference 
to the law or precedent a “noble lie” in 
Dworkin’s (1986) terms, resorted to because 
personal preferences or partisan political 
preferences could never be publicly stated 
as good reasons for justifying a decision? 
Are judges simply unquestioningly follow- 


be justified by legal authority and precedent? 
Or are they totally unaware of their own 
biases? 

Duncan Kennedy, one of the founders of 
Critical Legal Studies, draws on the psychol- 
ogy of Kohler, Lewin, and Piaget to explore 
the thought processes of judges in a way 
that is less fuzzy and more nuanced than 
the general realist point of view (Kennedy, 
1986). His hypothetical judge is a politi- 
cal reformist, of course, who is faced with 
a conflict between what the law seems to 
require and “how I want it to come out”: 
“imagine that I think the rule that seems to 
apply is bad because it strikes the wrong 
balance between two identifiable conflict- 
ing groups, and does so as part of a gener- 
ally unjust overall arrangement that includes 
many similar rules, all of which ought in the 
name of justice to change” (Kennedy, 1986, 
p. 519). The judge may reinterpret the facts, 
reinterpret the legal precedents, reinterpret 
the basic purpose of the law in the light of 
social policy, or make other moves. Judges 
will also consider how the public and other 
judges will view their decision, and finally, 
they really do care about the law and prece- 
dent; thus, the dilemma is a real cognitive 
dilemma, not just a matter of imposing their 
personal political motives. The decision will 
become part of the law that other judges 
must consider when they decide cases, so 
the judge also must worry about its future 
ramifications. “Legal argument is the process 
of creating the field of law through restate- 
ment rather than rule application” (Kennedy, 
1986, p. 562). The thought process evolves 
in time, beginning as a conflict and ending 
as certainty. Once a strategy is chosen, the 
judge no longer can imagine any compelling 
counterargument. Simon recently updated 
this analysis in the light of more recent re- 
search in social and cognitive psychology and 
showed that it has considerable power even 
in cases in which the judge has no particular 
political motivation: An incoherent mass of 
contradictions develops into a coherent de- 
cision in which no opposing argument carries 
any weight, but all turn out upon close ex- 
amination to support the decision (Simon, 
1998). 
pothesis confirmation, motivated informa- 
tion processing, ultimate overconfidence, 
and others — are not unique to legal rea- 
soners. They are true of us all, including 
scientists. Still, there are several important 
differences between legal reasoning and sci- 
entific reasoning. 


Differences Between Scientific 
Reasoning and Legal Reasoning 


As Llewellyn said, legal reasoning is not sci- 
entific reasoning, although it shares some an- 
alytic strategies, most notably the “method 
of comparison and difference” (Llewellyn, 
1930, p. 43) or, as we might say, “conver- 
gent and discriminant validity” (Campbell & 
Fiske, 1959) and the technique of simultane- 
ously considering alternative explanations or 
“multiple working hypotheses” (Chamber- 
lin, 1890; Campbell & Stanley, 1966). In fact, 
the legal decision maker in an adversarial sys- 
tem is forced to consider at least two com- 
peting hypotheses proposed by the parties. 
In this sense, the judge has some marginal 
protection against the thoughtless hypothe- 
sis confirmation to which scientists occasion- 
ally fall prey. This is not to say that judges 
are immune from hypothesis-confirming bi- 
ases, only that at the beginning of the process 
they are forced to consider at least two rival 
hypotheses. 

Nonetheless, the judge and the scientist 
have different tools available to them, dif- 
ferent constraints, and different goals. Sci- 
ence demands no final decisions; it is an on- 
going process. If the evidence is murky, sci- 
entists can wait, can reserve judgment until 
they can conduct further research. And they 
can figure out what further research needs to 
be done to answer the question, and do it. 
Judges can neither reserve judgment nor go 
beyond the data presented in court, how- 
ever ambiguous those data might be. They 
cannot carry out further research, nor wait 
until others have done so; they must decide. 

And the judge’s decision, whether the ev- 
idence is conclusive or completely inade- 


final. The scientist’s conclusions are never fi- 
nal, always tentative. 

The judge must also decide for one side 
or the other; the scientist’s decision that the 
truth lies somewhere between the extreme 
points of view is typically not available to 
the judge. As I will argue, these role con- 
straints in legal reasoning encourage cate- 
gorical thinking and a corresponding distrust 
of probabilistic reasoning, overconfidence, 
and a strong dispositional bias in which 
situational factors and attributional biases 
are overlooked, and the idea of free will is 
preserved. 


Lack of Opportunity for Empirical Testing 


Scientists and judges must both decide be- 
tween competing explanations. But when 
scientists are trying to decide among rival 
hypotheses, or even when testing a single hy- 
pothesis, sooner or later they put the ques- 
tion to nature. They design a study that will 
create new information, information that is 
not already in the system, that will help them 
to answer the question and to move forward 
in the way they think about the issues. In 
legal reasoning, there is no empirical op- 
tion. Judges must work with the information 
given to them, and that information consists 
entirely of what other people have said and 
the judge’s own knowledge. Judges listen to 
testimony and arguments and read the law, 
scholarly works, and the opinions of other 
judges; they arrange and rearrange these el- 
ements, selecting, interpreting, and looking 
for a rule that “holds good for the matter at 
hand” (Llewellyn, 1930, p. 72). The conclu- 
sion that the judge finally reaches is not em- 
pirically tested and cannot be disconfirmed. 

Of course, the judge may consider empir- 
ical data as part of the factual evidence in a 
case. Most cases involve experts of one sort 
or another — some who present the results of 
diagnostic tests (e.g., of bullets, blood, dan- 
gerousness, mental illness, almost anything 
you can think of), some who present the re- 
sults of empirical work specifically related 
to the case (e.g., contamination of the jury 
pool through pretrial publicity, evidence of 
tion policies), some who describe the results 
of general research that is germane to the 
issue (e.g., evidence that some substance in- 
creases the risk of cancer, or of factors affect- 
ing the reliability of eyewitness testimony). 
The legal realists would be pleased about this 
increasing prevalence of social science evi- 
dence in legal decision making, but the judge 
does not collect new evidence. 

The scientist is searching for truth. The 
judge wants to get the facts right, but that 
is not the whole task. The judge also wants 
to settle the dispute in a way that is consis- 
tent with the law and the decisions in pre- 
vious disputes and that is just. So it could 
be argued that the whole concept of an em- 
pirical test of the final decision is irrelevant, 
that there is no empirical test of justice. 
If two scientists make opposite predictions, 
someone will do a study to try to choose 
between them or otherwise clarify the ques- 
tion. If a judge makes a decision, it is fi- 
nal unless it is appealed. If it is appealed, 
the appellate court rarely re-examines the 
facts and certainly does not invite new evi- 
dence but decides whether the lower court 
made a legal (procedural) error (Mathieson 
& Gross, 2004). The final decision is the 
decision of the majority, and a five to four 
decision in the Supreme Court has the same 
precedential authority as a unanimous de- 
cision. When the Court is split four to four, 
the views of the ninth, “swing” Justice decide 
the case and can have precedential force — 
even if those views are quite idiosyncratic 
(e.g., Johnson v. Louisiana, 1972; Regents of 
the University of California v. Bakke, 1978). 


Need for an Immediate, Final Decision 


Unlike the judge, the scientist can reserve 
judgment and can say that, given the mud- 
dled state of the current evidence, there are 
many questions that we can’t answer yet and 
that further research is necessary. The judge 
has to decide, and usually he has to decide 
one way or the other, without the range of 
compromise solutions that are often avail- 
able to the scientist. Just as judges cannot 
create new information by conducting em- 


information before making a decision. 

When the courts use available scientific 
data in reaching a decision, this finality can 
be a source of frustration to scientific re- 
searchers. In 1970, the Supreme Court held 
that the size of a jury (six versus twelve 
members) does not affect its functioning 
(Williams v. Florida, 1970), and in 1972, it 
held that deliberation would be just as thor- 
ough in juries that were not required to 
reach a unanimous verdict as in those that 
were (Johnson v. Louisiana, 1972; Apodaca 
et al. v. Oregon, 1972). In the early 1970s, 
when these decisions were handed down, 
there was almost no research on the ef- 
fects of group size or the unanimity require- 
ment. Social scientists were stunned that 
such important decisions could be made on 
the basis of so little information, and a flood 
of studies and commentaries quickly fol- 
lowed, many of them suggesting that twelve- 
person, unanimous juries deliberate more 
thoroughly than six-person or nonunani- 
mous juries (Lempert, 1975; Saks & Ostrom, 
1975; Zeisel, 1971, on jury size; Hastie, 
Penrod, & Pennington, 1983, on unanimity). 
However, the Court had already held that 
neither the size of the jury nor the una- 
nimity requirement affected deliberations, 
and that six-person and nonunanimous ju- 
ries were constitutional. Although it is cer- 
tainly true that in science bad research can 
exert a baleful influence on the field for far 
longer than it should (because the finding is 
exciting, or because it is what people want to 
believe, or because the researcher is very fa- 
mous, or for various other reasons), it doesn’t 
have the same force as legal precedent. It 
is more acceptable and less costly for a sci- 
entist to reject a theory than for a judge 
to overturn a previous precedent. Authority 
matters in law; in science nothing enhances 
a career more than a convincing refutation 
of authority. 

Still, there have been cases in which 
the Supreme Court has expressed a more 
provisional, scientific point of view. In 
Witherspoon v. Illinois (1968) the Court 
had before it sketchy evidence based on 
three unpublished studies suggesting that 
from juries in capital cases (the common 
practice known as “death qualification”) bi- 
ased the jury toward a guilty verdict, and 
so when a defendant’s life was at stake he 
would face a greater risk of conviction than 
he would if the prosecutor had not asked for 
the death penalty. The Court decided that 
the research was, as yet, “too tentative and 
fragmentary” to reject death-qualification as 
unconstitutional but that future data might 
justify such a move. From a scientific point 
of view, such a holding is far more accept- 
able than a holding that said, “We have re- 
viewed the evidence and we conclude that 
death-qualification does not create a bias and 
therefore is constitutional,” which would be 
analogous to the Williams holding on jury 
size. From a practical point of view, how- 
ever, leaving a question open invites more 
litigation, and if the practice later is found 
to be unconstitutional, there is the problem 
of retroactivity — that is of what to do about 
all those people who were convicted by bi- 
ased, death-qualified juries. 


Categorical Thinking, Lack of 
Compromise, and Certainty 


The need to decide the particular case one 
way or the other also pushes legal reasoning 
toward categorical thinking: A person is ei- 
ther sane (guilty) or insane (not guilty); an 
unfit parent (someone else gets the child) or 
fit (he or she may get the child); a future 
danger to society (execution permitted) or 
not (execution not permitted, barring other 
aggravating factors). Psychologists consider 
sanity, fitness, and dangerousness to be con- 
tinuous variables with no great gulf between 
the sane and the insane, the fit and the un- 
fit, the safe and the dangerous, and many 
intermediate cases. But a legal case has to be 
decided for one party or the other, and so 
variables that are continuous are forced to 
become dichotomous. Sometimes there are 
more than two categories (first-degree mur- 
der, second-degree murder, and manslaugh- 
ter), but a line must always be drawn. 

The fact that the decision must be 
categorical very likely exercises an influence 
on the process of legal reasoning itself. 


ble, and in an adversary system, the judge 
is faced with two attorneys, each making 
the strongest possible case for diametrically 
opposed outcomes and thus minimizing any 
ambiguities. Experts may agree on most 
of the data in their field, but those are not 
the data that make for effective adversarial 
persuasion; thus, they are not likely to be 
presented in court, and the judge or jury is 
not likely to get a sense of how much con- 
sensus actually exists. The attorneys do their 
best to make every fact and every precedent 
fit their argument, trying to make it look 
as though the field is “impacted” (Kennedy, 
1986), with little room for doubt, and that 
everything about this case places it clearly 
on one side of the line. The combination of 
adversarial presentation and the need for a 
dichotomous decision may eventually make 
the legal reasoning of judges resemble that 
of advocates. The facts and law may begin by 
seeming to be a mass of contradictions, and 
the judge may be plagued by “the doubts 
and misgivings, the hope and fears” (Car- 
dozo, 1921, p. 167) common in significant 
enterprises that are fraught with uncertainty 
and ambiguity; however, judicial opinions 
almost never suggest that there was ever any 
uncertainty. Once the judge realizes which 
way he will probably decide the case and 
the rudiments of the justifications, “one of 
the effects... is a kind of tunnel vision: One 
is inside the strategy, sensitive to its internal 
economy, its history of trade-offs, attuned 
to developing it further but at least tem- 
porarily unable to imagine any other way to 
go” (Kennedy, 1986, p. 543). As in normal 
memory processes, strong pressures toward 
consistency and coherence arise, and the ar- 
guments and evidence that initially seemed 
to favor the other side evaporate. “This sense 
of unequivocal support for the one decision 
generates a sense of inevitability, of singular 
correctness” (Simon, 1998, p. 84), and judi- 
cial opinions are generally written as though 
all arguments support the conclusion, and 
there is no uncertainty whatever. Simon 
attributes this movement toward certainty 
to basic cognitive processes, and certainly 
this form of thinking is not unique to law; 
it is however exaggerated, I think, by the 




adversarial [...] (


little or no attention to the ambiguous, in- 
between facts and law) and by the necessity 
of always having to choose one side. 

The feeling that there must be a cer- 
tain outcome, and that expressions of uncer- 
tainty by a judge are a sign of weakness or 
incompetence (Simon, 1998, p. 12) seems 
quite bizarre in a world in which the basic 
insights of the legal realists are widely ac- 
cepted. But it is real. Despite the fact that 
majority and dissenting justices are perfectly 
certain (so presumably either one side is 
dead wrong or there is some uncertainty), 
and despite the fact that everyone knows 
that as soon as the next case comes along 
“the legal materials lose their recently ac- 
quired character, and return to their ambigu- 
ous existence within the world of multiple 
meanings” (Simon, 1998, p. 127), nonethe- 
less certainty is still valued as some sort of 
mastery and uncertainty as a sign of indeci- 
siveness at best and incompetence at worst. 
The decision must be justified in terms of 
the law, and it would be dangerous, in law 
as in chess or sports, to suggest that the law 
itself is ambiguous. 


Mistrust of Probabilistic Thinking 
and Aggregate Data 


This concern with certainty and the need 
to make dichotomous judgments may help 
explain why judges and legal scholars 
are often uncomfortable with probabilistic 
statements and probabilistic data. Scientists 
regularly make explicit quantified probabil- 
ity judgments; lawyers and judges do not — 
certainly not about the ultimate issues. For 
example, they strongly resist placing a nu- 
merical value on the “reasonable doubt” 
standard: Is it 95% certainty, 99% certainty? 
Jurors are generally just given the stock 
phrase, sometimes supplemented by other 
phrases, such as “to a moral certainty” or 
“firmly convinced.” 
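
To see why the choice of a number would matter, consider a minimal sketch in which a continuous degree of belief in guilt is collapsed into a dichotomous verdict. The function name, the belief values, and the two candidate thresholds are hypothetical illustrations; courts, as noted above, decline to fix any such figure.

```python
def verdict(p_guilt: float, threshold: float) -> str:
    """Collapse a continuous degree of belief into a dichotomous verdict."""
    return "guilty" if p_guilt >= threshold else "not guilty"


# Hypothetical degrees of belief that a fact finder might hold in three cases.
beliefs = [0.90, 0.96, 0.995]

# Two candidate readings of "beyond a reasonable doubt" (assumed values).
for threshold in (0.95, 0.99):
    print(threshold, [verdict(p, threshold) for p in beliefs])
```

The middle case changes sides when the threshold moves from 0.95 to 0.99, although nothing about the evidence has changed. Wherever the line is drawn, a continuous variable is forced to become dichotomous.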

This hesitation to consider probabilities is 
not unreasonable given that the judge has to 
make a yes or no decision about a particular 
individual. The judge’s task is more analo- 
gous to that of a doctor or clinical psycholo- 
gist than to that of a research scientist, and it 
is no accident that psychiatrists and clinical 
psychologists had close ties to the legal sys- 
tem long before research psychologists did. 
Explaining (or predicting) the behavior of a 
specific individual in a specific set of circum- 
stances is not what most scientists do and not 
what statistics are designed for. Experts will- 
ing to testify to the exact probability that a 
given defendant will commit a future crime 
are viewed as charlatans by the scientific 
community. However, statistical probabilis- 
tic data may be quite useful in illuminating 
other questions that judges must consider, 
such as whether a company is guilty of dis- 
crimination in hiring or whether a particular 
drug causes birth defects. These questions 
are typically addressed with aggregate data 
in which the results of many different stud- 
ies involving many different people are pro- 
vided by an expert. Judges have become far 
more receptive to statistical, empirical, ag- 
gregate studies over the past fifty years, but 
there is still a core reluctance. Experts who 
testify about the factors affecting eyewitness 
reliability often have to overcome a certain 
judicial skepticism about the value of their 
testimony because they have not examined 
this particular eyewitness but are only talk- 
ing about the circumstances that affect most 
eyewitnesses most of the time. Large-scale 
studies of pervasive racial discrimination in 
capital sentencing (Baldus, Woodworth, & 
Pulaski, 1990; Gross & Mauro, 1989) were 
rejected by the Supreme Court in McCleskey 
vs. Kemp (1987) in part because the ap- 
pellant had not shown that the particular 
jury that tried McCleskey was influenced 
by racial bias. The Court held that in order 
to succeed with a claim of racial discrimi- 
nation, an appellant must prove either (1) 
“that the decision makers in his case acted 
with discriminatory purpose” [emphasis in 
original], or (2) “that the Georgia legislature 
enacted or maintained the death penalty 
statute because of an anticipated racially dis- 
criminatory effect” [emphasis in original] 
(McCleskey vs. Kemp, 1987, p. 1769). 


Free Will and the Dispositional Bias 


Aggregate data are threatening in another 
way; they imply that many people in the 
same circumstances would behave in the 
same way and thereby threaten the notion 
of autonomy and free will so deeply rooted 
in the minds of legal thinkers. The law sees 
behavior as caused by people’s beliefs, desires, 
and preferences. Ideas of free choice 
and free will are still fundamental to legal 
thinking and largely unquestioned. This emphasis 
creates another source of tension between 
law and the social sciences because 
social science takes a much more deterministic 
point of view, emphasizing cultural, sociological, 
psychological, biological, and, especially 
in psychology, situational forces on 
behavior (Ross & Nisbett, 1991). The fact 
that economics is the social science that has 
been most successful in law schools is not 
surprising given this model; of all the social 
sciences, economics is the one most wedded 
to a free choice theory of behavior. 

The law has developed a highly elaborate 
set of definitions of various degrees of personal 
responsibility, including deliberation, 
intention, knowledge, recklessness, and negligence, 
but has been relatively untouched 
by psychological research on attributional 
biases and particularly by the research on 
the dispositional bias (fundamental attribution 
error) or by social psychological research 
demonstrating that situations play a 
far greater role than personal preferences 
and dispositions in determining people’s behavior 
(Ross & Nisbett, 1991). When situational 
forces are considered, such as in the 
concepts of necessity and duress, the situations 
are generally so extreme as to be irrelevant 
to everyday life (a person breaks 
into a lonely cabin in a blizzard because 
he is freezing to death or signs a contract 
because someone is holding a gun to her 
head) and can be taken as the exceptions 
that prove the rule that the pervasive power 
of the situation in all aspects of our lives 
is largely ignored by the law (Hanson & 
Yosifon, 2003; Ross & Shestowsky, 2003). 
The validity of the concept of free will has 
in fact troubled a sprinkling of legal scholars 
for a century (Pound, Green, Hanson), 
and these doubts have occasionally influenced 
sentencing practices but have rarely 
affected the basic attribution of guilt or liability. 
Even when exceptions are made, they 
generally are made on the basis of internal, 
dispositional factors (e.g., insanity, youth) 
and rarely on the basis of situational forces. 


Conclusions and Future Directions 


Legal reasoning is a form of expert reason- 
ing. Einstein argued that expert reasoning — 
in particular, scientific reasoning — is “noth- 
ing but a refinement of our everyday think- 
ing” (1936, in Bargmann [trans.] 1954, p. 
290). Like everyday problem solving and sci- 
entific reasoning, legal reasoning begins by 
examining a set of facts and figuring out 
what happened and why. Of course, some 
of the “facts” may be fictions, and the judge 
must decide which to believe and which to 
reject, but that is true of all natural prob- 
lem solving. Information is selected and re- 
jected as part of the process of creating a 
coherent story. 

It is the “refinements” that make one form 
of expert reasoning different from another. 
Like other forms of expert reasoning, the 
law has its own terminology, its own uni- 
verse of acceptable data, and its own rules. 
In law, the rules are more flexible than they 
are in some domains and more central than 
they are in most. They are more flexible than 
the rules of chess, for example, because in 
complex cases there are often many possible 
rules and precedents from which to choose, 
and both the facts and the rules can be inter- 
preted and reinterpreted in relation to each 
other until the judge is satisfied with the to- 
tal combination — satisfied with the fitness 
or coherence of the overall picture, and sat- 
isfied that the decision is just. 

The rules are more central in that every 
decision must be justified by explicit dis- 
cussion of the relevant rules: The rules are 
not just a framework for decision making; 
they are an essential part of the process. 
The sine qua non of empirical scientific re- 
search is a clear description of the research 
method. The judge has a mass of materials 
to work with, ranging from the incoherent, 
self-serving blabbering of a witness to the 
itself, and the sine qua non of legal reason- 
ing is the explanation of why this decision 
is the right one (Schauer, 1995), an expla- 
nation ultimately expressed as argument. 
This explanation “is meant not only to jus- 
tify the judgment in terms of an authorita- 
tive past but to constitute an authority to 
be referred to in the future” (White, 1985, 
p. 240). 

Despite the major developments in le- 
gal scholars’ interpretations of legal reasoning 
over the past century and a half, legal rea- 
soning itself has not changed substantially, 
and it is unlikely to do so in the near future. 
Law is a socially defined and socially con- 
structed system that is generally seen as serv- 
ing its purposes well. Undoubtedly there will 
be further changes in the nature of the fac- 
tual evidence judges consider relevant with 
increasing attention to general scientific re- 
search, but the form of legal reasoning, the 
rules of the game, cannot change without 
major changes in the system itself, and there 
is no indication of any such changes in the 
near future. 


Notes 


1. European civil law systems differ from com- 
mon law systems in many respects, such as a 
more active role for the trial court judge, less 
emphasis on precedent, and reconsideration of 
the facts at the appellate level. They are be- 
yond the scope of this chapter. 


Gordon (1984), Duncan Kennedy (1973), and, 
especially, Thomas C. Grey (1983). 

3. In the era of formalism, judges were men, so I 
refer to them as “he.” For the sake of balance, 
I refer to scientists as “she.” 


4. By this time, Holmes had been on the Supreme 
Court for many years, and Pound had become 
more conservative and more prosaic. 

5. Of course there are exceptions, and a brief de- 
scription like this one must always be, in some 
ways, a caricature. 


6. In actuality, compromise is pervasive in the 
legal system, because most civil cases are resolved 
by settlement and most criminal cases 
by plea bargain. The study of legal reasoning, 
however, focuses on the small minority of cases 
that are litigated and decided by judges. 


CHAPTER 29 


Scientific Thinking and Reasoning 


Kevin Dunbar 
Jonathan Fugelsang 


What Is Scientific Thinking 
and Reasoning? 


Scientific thinking refers to the mental 
processes used when reasoning about the 
content of science (e.g., force in physics), 
when engaging in typical scientific activities 
(e.g., designing experiments), or when using 
specific types of reasoning that are frequently 
used in science (e.g., deducing that there is 
a planet beyond Pluto). Scientific thinking involves 
many general-purpose cognitive operations 
that human beings also apply in nonscientific 
domains, such as induction, deduction, analogy, 
problem solving, and causal reasoning. 
These cognitive processes are covered 
in many chapters of this handbook (see 
Sloman & Lagnado, Chap. 5 on induction; 
Holyoak, Chap. 6 on analogy; Buehner and 
Cheng, Chap. 7 on causality; Evans, Chap. 
8 on deduction; Novick and Bassok, Chap. 
14 on problem solving; Chi and Ohlsson, 
Chap. 16 on conceptual change). What distinguishes 
research on scientific thinking 
from general research on cognition is that 
research on scientific thinking typically involves 
investigating thinking that has scientific 
content. A number of overlapping research 
traditions have been used to investigate 
scientific thinking. We cover the history 
of research on scientific thinking and the different 
approaches that have been used, highlighting 
common themes that have emerged 
over the past fifty years of research. 


A Brief History of Research 
on Scientific Thinking 


Science is often considered one of the hall- 
marks of the human species, along with 
art, music, and literature. Illuminating the 
thought processes used in science there- 
fore reveals key aspects of the human mind. 
The thought processes underlying scientific 
thinking have fascinated both scientists and 
nonscientists because the products of sci- 
ence have transformed our world and be- 
cause the process of discovery is shrouded 
in mystery. Scientists talk of the chance dis- 
covery, the flash of insight, the years of per- 
spiration, and the voyage of discovery. These 
mental processes underlying the discovery 
process intriguing to cognitive scientists as 
they attempt to uncover what really goes 
on inside the scientific mind and how sci- 
entists really think. Furthermore, the ques- 
tions, “Can scientists be taught to think bet- 
ter, avoiding mistakes of scientific thinking?” 
and “Could the scientific process be auto- 
mated such that scientists are no longer nec- 
essary?” make scientific thinking a topic of 
enduring interest. One of the most compelling 
recent accounts of why science is interesting, 
one that makes the reader want to 
understand science, appeared in 
the journal Popular Science. In this article, 
Charles Hirshberg discusses his mother, sci- 
entist Joan Feynman, and her scientific con- 
tributions as well as the difficulties of being a 
woman scientist. The following excerpt cap- 
tures the excitement and thrill that even a 
household encounter with science can gen- 
erate and that is thought to be at the root 
of many scientists’ desire to conduct science 
(Hirshberg, 2003). 


My introduction to chemistry came in 
1970, on a day when my mom was bak- 
ing challah bread for the Jewish New Year. 
I was about ten, and though I felt cooking 
was unmanly for a guy who played short- 
stop for Village Host Pizza in the Menlo 
Park, California, Little League, she had 
persuaded me to help. When the bread was 
in the oven, she gave me a plastic pill bot- 
tle and a cork. She told me to sprinkle a 
little baking soda into the bottle, then a lit- 
tle vinegar, and cork the bottle as fast as 
I could. There followed a violent and com- 
pletely unexpected pop as the cork flew off 
and walloped me in the forehead. Explod- 
ing food: I was ecstatic! “That's called a 
chemical reaction,” she said, rubbing my 
shirt clean. “The vinegar is an acid and the 
soda is a base, and that's what happens 
when you mix the two.” After that, I never 
understood what other kids meant when 
they said that science was boring. 


The cognitive processes underlying sci- 
entific discovery and day-to-day scientific 
thinking have been a topic of intense 
scrutiny and speculation for almost 400 


2000; Tweney, Doherty, & Mynatt, 1981). 
Understanding the nature of scientific think- 
ing has been an important and central issue 
not only for our understanding of science, 
but also for our understanding of what it is to 
be human. Bacon’s Novum Organum, in 
1620, sketched out some of the key features 
of the ways that experiments are designed 
and data interpreted. Over the ensuing 400 
years, philosophers and scientists vigorously 
debated the appropriate methods that scien- 
tists should use (see Giere, 1993). These de- 
bates over the appropriate methods for sci- 
ence typically resulted in the espousal of a 
particular type of reasoning method such as 
induction or deduction. It was not until the 
Gestalt psychologists began working on the 
nature of human problem solving during the 
1940s that experimental psychologists began 
to investigate the cognitive processes under- 
lying scientific thinking and reasoning. 

The Gestalt psychologist Max Wertheimer 
initiated the first investigations of scientific 
thinking in his landmark book, Productive 
Thinking (Wertheimer, 1945; see 
Novick & Bassok, Chap. 14). Wertheimer 
spent a considerable amount of time corre- 
sponding with Albert Einstein, attempting 
to discover how Einstein generated the con- 
cept of relativity. Wertheimer argued that 
Einstein had to overcome the structure of 
Newtonian physics at each step in his the- 
orizing and the ways that Einstein actually 
achieved this restructuring were articulated 
in terms of Gestalt theories. For a recent 
and different account of how Einstein made 
his discovery, see Galison (2003). We will 
see later how this process of overcoming al- 
ternative theories is an obstacle with which 
both scientists and nonscientists need to 
deal when evaluating and theorizing about 
the world. 

One of the first investigations of the cog- 
nitive abilities underlying scientific think- 
ing was the work of Jerome Bruner and his 
colleagues at Harvard (Bruner, Goodnow, & 
Austin, 1956). They argued that a key ac- 
tivity in which scientists engage is to deter- 
mine whether or not a particular instance 
is a member of a category. For example, a 
stances undergo fission when bombarded by 
neutrons and which substances do not. Here, 
scientists have to discover the attributes that 
make a substance undergo fission. Bruner 
et al. (1956) saw scientific thinking as the 
testing of hypotheses and collecting of data 
with the end goal of determining whether 
something is a member of a category or not. 
They invented a paradigm in which people 
were required to formulate hypotheses and 
collect data that test their hypotheses. Us- 
ing this approach, Bruner et al. identified a 
number of strategies people use to formu- 
late and test hypotheses. They found that 
a key factor determining which hypothesis 
testing strategy people use is the amount of 
memory capacity the strategy takes up (see 
also Morrison, Chap. 19, on working mem- 
ory). Another key factor they discovered was 
that it is much more difficult for people to 
discover negative concepts (e.g., not blue) 
than positive concepts (e.g., blue). Although 
the Bruner et al. research is most com- 
monly thought of as work on concepts, they 
saw their work as uncovering a key compo- 
nent of scientific thinking. 

A second early line of research on scien- 
tific thinking was developed by Peter Wa- 
son and his colleagues. Like Bruner et al., 
Wason (1968) saw a key component of sci- 
entific thinking as being the testing of hy- 
potheses. Whereas Bruner et al. focused on 
the different types of strategies people use 
to formulate hypotheses, Wason focused on 
whether people adopt a strategy of trying 
to confirm or disconfirm their hypotheses. 
Using Popper’s (1959) theory that scientists 
should try to falsify rather than confirm 
their hypotheses, Wason devised a deceptively 
simple task in which participants 
were given three numbers, such as 2-4-6, 
and were asked to discover the rule under- 
lying the three numbers. Participants were 
asked to generate other triads of numbers, 
and the experimenter would tell the partic- 
ipant whether the triad was consistent or 
inconsistent with the rule. They were told 
that when they were sure they knew what 
the rule was they should state it. Most participants 
began the experiment by thinking 
that the rule was even numbers increasing by 
two. They then attempted to confirm their 
hypothesis by generating a triad like 8-10- 
12, then 14-16-18. These triads are consis- 
tent with the rule and the participants were 
told yes, that the triads were indeed con- 
sistent with the rule. However, when they 
proposed the rule, even numbers increas- 
ing by two, they were told that the rule 
was incorrect. The correct rule was numbers 
of increasing magnitude. From this research 
Wason concluded that people try to 
confirm their hypotheses, whereas normatively 
speaking, they should try to disconfirm 
their hypotheses. One implication of 
this research is that confirmation bias is not 
just restricted to scientists but is a general 
human tendency. 
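
The logic of the 2-4-6 task can be made concrete with a short sketch. The Python below is only an illustration, not Wason's procedure: the function names and the two extra test triads (1-2-3 and 6-4-2) are assumptions chosen for the example. It shows why triads generated to fit the favored hypothesis cannot expose that hypothesis as wrong: the feedback they produce is exactly what the hypothesis predicts.

```python
def true_rule(triad):
    """The experimenter's rule in the task: numbers of increasing magnitude."""
    a, b, c = triad
    return a < b < c


def favored_hypothesis(triad):
    """The typical participant hypothesis: even numbers increasing by two."""
    a, b, c = triad
    return a % 2 == 0 and b - a == 2 and c - b == 2


def is_informative(triad):
    """A test can reveal that the hypothesis is wrong only when the
    experimenter's feedback differs from what the hypothesis predicts."""
    return true_rule(triad) != favored_hypothesis(triad)


# Triads from the text, chosen to fit the favored hypothesis.
confirming_tests = [(8, 10, 12), (14, 16, 18)]
# Hypothetical triads that the favored hypothesis says should be rejected.
disconfirming_tests = [(1, 2, 3), (6, 4, 2)]

for triad in confirming_tests + disconfirming_tests:
    print(triad,
          "hypothesis predicts:", favored_hypothesis(triad),
          "feedback:", true_rule(triad),
          "informative:", is_informative(triad))
```

Under these assumptions, only a triad such as 1-2-3, which the favored hypothesis rejects but the true rule accepts, yields feedback that contradicts the hypothesis. A triad such as 6-4-2 is rejected by both and so is as uninformative as the confirming triads.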

It was not until the 1970s that a general 
account of scientific reasoning was proposed. 
Herbert Simon, often in collaboration with 
Allen Newell (e.g., Newell & Simon, 1972), 
proposed that scientific thinking is a form 
of problem solving. He proposed that prob- 
lem solving is a search in a problem space. 
Newell and Simon’s (1972) theory of prob- 
lem solving is discussed in many places in 
this volume, usually in the context of spe- 
cific problems (see especially Novick & Bas- 
sok, Chap. 14, on problem solving). Herbert 
Simon (1977), however, devoted consider- 
able time to understanding many different 
scientific discoveries and scientific reason- 
ing processes. The common thread in his re- 
search was that scientific thinking and dis- 
covery is not a mysterious magical process 
but a process of problem solving in which 
clear heuristics are used. Simon’s goal was to 
articulate the heuristics that scientists use in 
their research at a fine-grained level. He built 
many programs that simulated the process of 
scientific discovery and articulated the spe- 
cific computations that scientists use in their 
research (see subsequent section on compu- 
tational approaches to scientific thinking). 
Particularly important was Simon and Lea’s 
(1974) work demonstrating that concept for- 
mation and induction consist of a search in 
two problem spaces: a space of instances and 
a space of rules. This idea has been highly 
influential on problem-solving accounts of 
the next section. 

Overall, the work of Bruner, Wason, and 
Simon laid the foundations for contempo- 
rary research on scientific thinking. Early 
research on scientific thinking is conve- 
niently summarized in Tweney, Doherty, 
and Mynatt’s 1981 book, On Scientific Think- 
ing, in which they sketched out many of the 
themes that have dominated research on sci- 
entific thinking over the past few decades. 
Other more recent books, such as Ronald 
Giere’s Cognitive Models of Science (1993); 
David Klahr’s Exploring Science (2000); Pe- 
ter Carruthers, Stephen Stich, and Michael 
Siegal’s Cognitive Basis of Science (2002); and 
Gorman and colleagues’ Scientific and Tech- 
nical Thinking (2005) provide detailed anal- 
yses of different aspects of scientific discov- 
ery. In this chapter, we discuss the main ap- 
proaches that have been used to investigate 
scientific thinking. 

One of the main features of research 
on the scientific mind has been 
to take one aspect of scientific thinking that 
is thought to be important and investigate 
it in isolation. How does one go about in- 
vestigating the many different aspects of sci- 
entific thinking? Numerous methodologies 
have been used to analyze the genesis of sci- 
entific concepts, theories, hypotheses, and 
experiments. Researchers have used experi- 
ments, verbal protocols, computer programs, 
and analysis of particular scientific discover- 
ies. A recent development has been to inves- 
tigate scientists as they reason “live” (in vivo 
studies of scientific thinking) in their own 
laboratories (Dunbar, 1995, 2002). From a 
“thinking and reasoning” standpoint, the ma- 
jor aspects of scientific thinking that have 
been most actively investigated are prob- 
lem solving, analogical reasoning, hypothe- 
sis testing, conceptual change, collaborative 
reasoning, inductive reasoning, and deduc- 
tive reasoning. 


Scientific Thinking as Problem Solving 


One important goal for accounts of scientific 
thinking has been to provide an overarching 
framework for understanding the scientific 
mind. One framework that has had a 
great influence in cognitive science is that 
scientific thinking and scientific discovery 
can be conceived as a form of problem solv- 
ing. Simon (1977) argued that both scientific 
thinking in general and problem solving in 
particular could be thought of as a search in 
a problem space (see Chapter 11). A prob- 
lem space consists of all the possible states 
of a problem and all the operations that a 
problem solver can use to get from one state 
to the next (see Novick & Bassok, Chap. 14, 
on problem solving). According to this view, 
by characterizing the 
types of representations and procedures peo- 
ple use to get from one state to another, it 
is possible to understand scientific thinking. 
Scientific thinking therefore can be charac- 
terized as a search in various problem spaces 
(Simon, 1977). Simon investigated a num- 
ber of scientific discoveries by bringing par- 
ticipants into the laboratory, providing the 
participants with the data to which a sci- 
entist had access, and getting the partici- 
pants to reason about the data and rediscover 
a scientific concept. He then analyzed the 
verbal protocols participants generated and 
mapped out the types of problem spaces in 
which the participants searched (e.g., Qin & 
Simon, 1990). Kulkarni and Simon (1988) 
used a more historical approach to uncover 
the problem-solving heuristics that Krebs 
used in his discovery of the urea cycle. Kulka- 
rni and Simon analyzed Krebs’s diaries and 
proposed a set of problem-solving heuristics 
that he used in his research. They then built a 
computer program incorporating the heuris- 
tics and biological knowledge that Krebs had 
before he made his discoveries. Of particular 
importance are the search heuristics the pro- 
gram uses such as the experimental proposal 
heuristics and the data interpretation heuris- 
tics built into the program. A key heuristic 
was an unusualness heuristic that focused 
on unusual findings and guided the search 
through a space of theories and a space of 
experiments. 
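
A problem space in this sense can be sketched as a set of states plus operators that move from one state to the next. The following Python is a minimal illustration under stated assumptions: the toy number puzzle, the operator names, the bound on state size, and the use of breadth-first search are all hypothetical choices made for the example. It is not a reconstruction of the programs described above, which relied on domain knowledge and heuristics rather than exhaustive search.

```python
from collections import deque


def search_problem_space(start, is_goal, operators):
    """Breadth-first search through a problem space.

    Returns the list of operator names that leads from the start state
    to a goal state, or None if no goal state can be reached.
    """
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        for name, apply_operator in operators:
            next_state = apply_operator(state)
            if next_state is not None and next_state not in visited:
                visited.add(next_state)
                frontier.append((next_state, path + [name]))
    return None


LIMIT = 100  # hypothetical bound that keeps the toy space finite


def add_three(n):
    return n + 3 if n + 3 <= LIMIT else None


def double(n):
    return n * 2 if n * 2 <= LIMIT else None


OPERATORS = [("add 3", add_three), ("double", double)]

# Toy problem: get from the state 1 to the state 11.
path = search_problem_space(1, lambda n: n == 11, OPERATORS)
print(path)
```

Heuristics such as the unusualness heuristic play the role that exhaustive expansion plays here: they decide which states are worth exploring next, so that only a small part of the space is searched.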

Klahr and Dunbar (1988) extended the 
search in a problem space approach and pro- 
posed that scientific thinking can be thought 
of as a search through two related spaces — 
an hypothesis space and an experiment 
uses will have its own types of representa- 
tions and operators used to change the rep- 
resentations. Search in the hypothesis space 
constrains search in the experiment space. 
Klahr and Dunbar found that some partic- 
ipants move from the hypothesis space to 
the experiment space, whereas others move 
from the experiment space to the hypothesis 
space. These different types of searches lead 
to the proposal of different types of hypothe- 
ses and experiments. More recent work 
has extended the dual-space approach to 
include alternative problem-solving spaces, 
including those for data, instrumentation, 
and domain-specific knowledge (Schunn & 
Klahr, 1995, 1996; Klahr & Simon, 1999). 


Scientific Thinking as Hypothesis 
Testing 


Many researchers have regarded testing spe- 
cific hypotheses predicted by theories as one 
of the key attributes of scientific thinking. 
Hypothesis testing is the process of evalu- 
ating a proposition by collecting evidence 
regarding its truth. Experimental cognitive 
research on scientific thinking that specifi- 
cally examines this issue has tended to fall 
into two broad classes of investigations. The 
first class is concerned with the types of 
reasoning that lead scientists astray, block- 
ing scientific ingenuity (see also Sternberg, 
Chap. 15 on creativity). A large amount of 
research has been conducted on the poten- 
tially faulty reasoning strategies that both 
participants in experiments and scientists 
use such as considering only one favored hy- 
pothesis at a time and how this prevents 
scientists from making discoveries. The sec- 
ond class is concerned with uncovering the 
mental processes underlying the generation 
of new scientific hypotheses and concepts. 
This research has tended to focus on the use 
of analogy and imagery in science as well as 
the use of specific types of problem-solving 
heuristics (see also Holyoak, Chapter 6 
on analogy). 

Turning first to investigations of what di- 
minishes scientific creativity, philosophers, 


have devoted a considerable amount of re- 
search to “confirmation bias.” This occurs 
when scientists consider only one hypoth- 
esis (typically the favored hypothesis) and 
ignore alternative hypotheses or other po- 
tentially relevant hypotheses. This impor- 
tant phenomenon can distort the design of 
experiments, formulation of theories, and 
interpretation of data. Beginning with the 
work of Wason (1968) and as discussed pre- 
viously, researchers have repeatedly shown 
that when participants are asked to design 
an experiment to test a hypothesis, they pre- 
dominantly design experiments they think 
will yield results consistent with the hypoth- 
esis. Using the 2-4-6 task mentioned ear- 
lier, Klayman and Ha (1987) showed that 
in situations in which one’s hypothesis is 
likely to be confirmed, seeking confirmation 
is a normatively incorrect strategy, whereas 
when the probability of confirming one’s 
hypothesis is low, then attempting to confirm 
one’s hypothesis can be an appropriate 
strategy. Historical analyses by Tweney 
(1989) on the way that Faraday made his dis- 
coveries and experiments investigating peo- 
ple testing hypotheses have revealed that 
people use a confirm early—disconfirm late 
strategy: When people initially generate or 
are given hypotheses, they try to gather ev- 
idence that is consistent with the hypoth- 
esis. Once enough evidence has been gath- 
ered, people attempt to find the boundaries 
of their hypothesis and often try to discon- 
firm their hypotheses. 

In an interesting variant on the con- 
firmation bias paradigm, Gorman (1989) 
has shown that when participants are told 
there is the possibility of error in the data 
they receive, they assume any data incon- 
sistent with their favored hypothesis are at- 
tributable to error. The possibility of error 
therefore insulates hypotheses against dis- 
confirmation. This hypothesis has not been 
confirmed by other researchers (Penner & 
Klahr, 1996) but is an intriguing one that 
warrants further investigation. 

Confirmation bias is very difficult to over- 
come. Even when participants are asked 
to consider alternate hypotheses, they of- 
ten fail to conduct experiments that could 
Tweney and his colleagues provide an excel- 
lent overview of this phenomenon in their 
classic monograph “On Scientific Think- 
ing” (1981). The precise reasons for this 
type of block are still widely debated. Re- 
searchers such as Michael Doherty have ar- 
gued that limitations in working memory 
make it difficult for people to consider more 
than one hypothesis. Consistent with this 
view, Dunbar and Sussman (1995) showed 
that when participants are asked to hold 
irrelevant items in working memory while 
testing hypotheses, participants are unable 
to switch hypotheses in the face of inconsis- 
tent evidence (see also Morrison, Chap. 19, 
on working memory). Although limitations 
of working memory are involved in the phe- 
nomenon of confirmation bias, even groups 
of scientists can display confirmation bias. 
The recent controversies over cold fusion 
are an example of confirmation bias. Here, 
large groups of scientists had other hypotheses 
available to explain their data yet 
maintained their hypotheses in the face of 
other, more standard alternative hypotheses. 
Mitroff (1974) provides some interesting examples 
of scientists at the National Aeronautics 
and Space Administration demonstrating 
confirmation bias that highlight 
the roles of commitment and motivation in 
this process. 


Causal Thinking in Science 


Much of scientific thinking and scientific 
theory building pertains to the development 
of causal models between variables of inter- 
est. For example, does smoking cause cancer, 
Prozac relieve depression, or aerosol spray 
deplete the ozone layer? (See also Buehner & 
Cheng, Chap. 7, on causality.) Scientists and 
nonscientists alike are constantly bombarded 
with statements regarding the causal rela- 
tionship between such variables. How does 
one evaluate the status of such claims? What 
kinds of data are informative? How do sci- 
entists and nonscientists deal with data that 
are inconsistent with their theory? 


soning literature that is directly relevant to 
scientific thinking is the extent to which sci- 
entists and nonscientists are governed by the 
search for causal mechanisms (i.e., the chain 
of events that lead from a cause to an effect) 
versus the search for statistical data (i.e., how 
often variables co-occur). This dichotomy 
can be boiled down to the search for quali- 
tative versus quantitative information about 
the paradigm the scientist is investigating. 
Researchers from a number of cognitive psy- 
chology laboratories have found that peo- 
ple prefer to gather more information about 
an underlying mechanism than covariation 
between a cause and an effect (e.g., Ahn 
et al., 1995). That is, the predominant strat- 
egy that students in scientific thinking simu- 
lations use is to gather as much information 
as possible about how the objects under in- 
vestigation work rather than collecting large 
amounts of quantitative data to determine 
whether the observations hold across mul- 
tiple samples. These findings suggest that a 
central component of scientific thinking may 
be to formulate explicit mechanistic causal 
models of scientific events. 
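
To make "how often variables co-occur" concrete, covariation is often summarized with a contingency measure. The sketch below uses one such measure, the difference between the probability of the effect when the candidate cause is present and when it is absent. The text does not name this measure; it and the example counts are assumptions introduced purely for illustration.

```python
def delta_p(cause_effect, cause_no_effect, no_cause_effect, no_cause_no_effect):
    """Return P(effect | cause) - P(effect | no cause) from a 2x2 table of counts."""
    p_effect_given_cause = cause_effect / (cause_effect + cause_no_effect)
    p_effect_given_no_cause = no_cause_effect / (no_cause_effect + no_cause_no_effect)
    return p_effect_given_cause - p_effect_given_no_cause


# Hypothetical counts: 30 of 100 cases with the candidate cause show the effect,
# and 10 of 100 cases without it do.
contingency = delta_p(30, 70, 10, 90)
print(round(contingency, 2))
```

A value near zero means the effect is about equally likely with or without the candidate cause. The measure says nothing about the chain of events linking cause to effect, which is the kind of information participants in these studies preferred to seek.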

One place where causal reasoning has 
been observed extensively is when scientists 
obtain unexpected findings. Both historical 
and naturalistic research has revealed that 
reasoning causally about unexpected find- 
ings has a central role in science. Indeed, 
scientists themselves frequently state that a 
finding was attributable to chance or was un- 
expected. Given that claims of unexpected 
findings are such a frequent component of 
scientists’ autobiographies and interviews 
in the media, Dunbar (1995, 1997, 1999; 
Dunbar & Fugelsang, 2004; Fugelsang et al., 
2004) decided to investigate the ways that 
scientists deal with unexpected findings. In 
1991-1992 Dunbar spent one year in three 
molecular biology laboratories and one im- 
munology laboratory at a prestigious U.S. 
university. He used the weekly laboratory 
meeting as a source of data on scientific dis- 
covery and scientific reasoning. (This type of 
study he has called InVivo cognition.) When 
he examined the types of findings the sci- 
entists made, he found that more than 50% 
were unexpected and that these scientists 
had evolved a number of important strate- 
gies for dealing with such findings. One clear 
strategy was to reason causally about the 
findings: Scientists attempted to build causal 
models of their unexpected findings. This 
causal model building resulted in the extensive 
use of collaborative reasoning, analogical 
reasoning, and problem-solving heuristics 
(Dunbar, 1997, 2001). 

Many of the key unexpected findings 
that scientists reasoned about in the InVivo 
studies of scientific thinking were inconsis- 
tent with the scientists’ pre-existing causal 
models. A laboratory equivalent of the bi- 
ology labs therefore was to create a situa- 
tion in which students obtained unexpected 
findings that were inconsistent with their 
pre-existing theories. Dunbar and Fugelsang 
(2005; see also Fugelsang et al., 2004) ex- 
amined this issue by creating a scientific 
causal thinking simulation in which exper- 
imental outcomes were either expected or 
unexpected. (Dunbar [1995] called this type
of study of people reasoning in a cognitive
laboratory InVitro cognition.) They found,
first, that students spent considerably more
time reasoning about unexpected findings than
expected findings. Second, when assessing
the overall degree to which their hypothesis
was supported or refuted, participants
spent the majority of their time consid- 
ering unexpected findings. An analysis of 
participants’ verbal protocols indicates that 
much of this extra time is spent formu- 
lating causal models for the unexpected 
findings. 

Scientists are not merely the victims
of unexpected findings but plan for
unexpected events to occur. An example of the
ways that scientists plan for unexpected
contingencies in their day-to-day research is
shown in Figure 29.1. The figure is a diagram,
taken from a presentation at a lab meeting,
in which the scientist builds causal models
of the ways that human immunodeficiency
virus (HIV) integrates itself into the host
deoxyribonucleic acid (DNA).
The scientist proposes two
main causal mechanisms by which HIV in- 
tegrates into the host DNA. The main event 
that must occur is that gaps in the DNA 
must be filled. In the left-hand branch of 
Diagram 2, he proposes a cellular mech- 
anism whereby cellular polymerase fills in 
gaps as the two sources of DNA integrate. 
In the right-hand branch, he proposes that 
instead of cellular mechanisms filling in the 
gaps, viral enzymes fill in the gap and join 
the two pieces of DNA. He then designs an
experiment to distinguish between these
two causal mechanisms. Clearly, visual and
diagrammatic reasoning is used here and is 
a useful way of representing different causal 
mechanisms (see also Tversky, Chap. 10 on 
visuospatial reasoning). In this case, the vi- 
sual representations of different causal paths 
are used to design an experiment and predict 
possible results. Thus, causal reasoning is a 
key component of the experimental design 
process. 

When designing experiments, scientists 
know that unexpected findings occur of- 
ten and have developed many strategies to 
take advantage of them (Baker & Dunbar, 
2000). Scientists build different causal mod- 
els of their experiments incorporating many 
conditions and controls. These multiple 
conditions and controls allow unknown 
mechanisms to manifest themselves. Rather 
than being the victims of the unexpected, 
the scientists create opportunities for unex- 
pected events to occur, and once these events 
do occur, they have causal models that al- 
low them to determine exactly where in the 
causal chain their unexpected finding arose. 
The results of these InVivo and InVitro stud- 
ies all point to a more complex and nuanced 
account of how scientists and nonscientists 
test and evaluate hypotheses. 


The Roles of Inductive and Deductive 
Thinking in the Scientific Mind 


One of the most basic characteristics of sci- 
ence is that scientists assume that the uni- 
verse that we live in follows predictable 
rules. Very few scientists in this century
would dispute the claim that the earth
revolves around the sun, for example.
Scientists reason from these rules using a variety
of different strategies to make new scien- 
tific discoveries. Two frequently used types 
of reasoning strategies are inductive (see 
Sloman & Lagnado, Chap. 5) and deductive 
reasoning (see Evans, Chap. 8). In the case
of inductive reasoning, a scientist may
observe a series of events and try to discover a
rule that governs them. Once a rule is
discovered, scientists can extrapolate from the
rule to formulate theories of the observed 
and yet-to-be-observed phenomena. One
example is the use of inductive reasoning in the
discovery that a certain type of bacterium is a
cause of many ulcers (Thagard, 1999). In a
fascinating series of articles, Thagard docu- 
ments the reasoning processes that Marshall 
and Warren went through in proposing this 
novel hypothesis. One key reasoning pro- 
cess was the use of induction by generaliza- 
tion. Marshall and Warren noted that almost 
all patients with gastric enteritis had a spi- 
ral bacterium in their stomachs and formed 
the generalization that this bacterium is the 
cause of many stomach ulcers. There are
numerous other examples of induction by
generalization in science, such as Tycho Brahe’s
induction about the motion of planets from
his observations, Dalton’s use of induction in 
chemistry, and the discovery of prions as the 
source of mad cow disease. Many theories 
of induction have used scientific discovery 
and reasoning as examples of this important 
reasoning process. 

Another common type of inductive rea- 
soning is to map a feature of one member 
of a category to another member of a cate- 
gory. This is called categorical induction. This 
type of induction projects a known prop- 
erty of one item onto another item from the 
same category. Thus, knowing that the Rous 
Sarcoma virus is a retrovirus that uses RNA 
rather than DNA, a biologist might assume 
that another virus that is thought to be a 
retrovirus also uses RNA rather than DNA. 
Although research on this type of induction 
typically has not been discussed in accounts 
of scientific thinking, this type of induction 
is common in science. For an important con- 
tribution to this literature see Smith, Shafir, 
and Osherson (1993), and for a review of 
this literature see Heit (2000). 

Turning now to deductive thinking, many 
thinking processes to which scientists adhere 
follow traditional rules of deductive logic. 
These processes correspond to conditions in
which a hypothesis leads to a conclusion, or
in which a conclusion can be deduced from it.
Although they are not always stated in this
form, deductive arguments can usually be phrased as
“syllogisms,” or as brief mathematical state- 
ments in which the premises lead to the con- 
clusion. Deductive reasoning is an extremely 
important aspect of scientific thinking be- 
cause it underlies a large component of how 
scientists conduct their research. By looking 
at many scientific discoveries, we can often 
see that deductive reasoning is at work. De- 
ductive reasoning statements all contain in- 
formation or rules that state an assumption 
about how the world works and a conclu- 
sion that would necessarily follow from the 
rule. A classic example that is still receiv- 
ing much scientific investigation today is the 
case of Planet X. In the early twentieth cen- 
tury, Percival Lowell coined the term “Planet 
X” when referring to any planet yet to be dis- 
covered. Around that time and continuing to 
this day, based on rather large residual orbital 
perturbations of Uranus and Neptune, many 
scientists are convinced there exists a yet to 
be discovered planet in our solar system. Be- 
cause it is assumed as fact that only large ob- 
jects that possess a strong gravitational force 
can cause such perturbations, the search for 
such an object ensued. Given Pluto’s rather 
meager stature, it has been dismissed as a 
candidate for these perturbations. We can 
apply these statements to deductive logic 
as follows: 


Premise 1: The gravitational force of large 
planetary bodies causes perturbations in or- 
bits of planetary bodies 

Premise 2: Uranus and Neptune have per- 
turbations in their orbits 

Conclusion: The gravitational force of a 
large planetary body influences the orbits 
of Uranus and Neptune 


Of course, the soundness of the logical de- 
duction is completely dependent on the 
accuracy of the premises. If the premises 
are correct, then the conclusion will 
be correct. 
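
The structure of this argument can be written out formally. The sketch below assumes the reading given in the text, namely that only large bodies can cause such perturbations. The predicate names P and G are introduced here purely for illustration and do not appear in the original argument.

```latex
% P(x): the orbit of body x shows perturbations (illustrative predicate)
% G(x): the gravitational force of a large planetary body acts on x (illustrative predicate)
\begin{align*}
\text{Premise 1 (read with ``only''):} \quad & \forall x\,\bigl(P(x) \rightarrow G(x)\bigr) \\
\text{Premise 2:} \quad & P(\text{Uranus}) \wedge P(\text{Neptune}) \\
\text{Conclusion:} \quad & G(\text{Uranus}) \wedge G(\text{Neptune})
\end{align*}
```

Read this way, the conclusion follows by modus ponens. If Premise 1 were read only in the other direction (the gravity of a large body produces perturbations), inferring a large body from observed perturbations would affirm the consequent, which is not deductively valid.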

Inductive and deductive reasoning, even 
by successful scientists, is not immune to 
error. Two classes of errors commonly found
in deductive reasoning are context and
content errors. A common context error that
people often make is to assume that
conditional relationships are, in fact,
biconditional. The conditional statement
“if someone has AIDS then they also have HIV,”
for example, does not necessarily imply that
“if someone has HIV then they also have
AIDS.” This is a common error in
deductive reasoning that can result in logically
incorrect conclusions being drawn. A common
content error people often make is to modify 
the interpretation of a conclusion based on 
the degree to which the conclusion is plau- 
sible. Here, scientists may be more likely to 
accept a scientific discovery as valid if the 
outcome is plausible. You can see how this 
second class of errors in deductive logic can 
have profound implications for theory de- 
velopment. Indeed, if scientists are overly 
blinded by the plausibility of an outcome, 
they may fail to objectively evaluate the 
steps in their deductive process. 
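
The first error above, treating a conditional as a biconditional, can be checked mechanically. The following is a minimal sketch that enumerates every truth assignment for the two propositions in the AIDS/HIV example; the function and variable names are assumptions introduced for illustration.

```python
from itertools import product

def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

def counterexamples(premise, conclusion):
    """Return every (aids, hiv) assignment where the premise holds but the conclusion fails."""
    return [
        (aids, hiv)
        for aids, hiv in product([True, False], repeat=2)
        if premise(aids, hiv) and not conclusion(aids, hiv)
    ]

def if_aids_then_hiv(aids, hiv):
    return implies(aids, hiv)

def if_hiv_then_aids(aids, hiv):
    return implies(hiv, aids)

# The converse follows from the conditional only if there are no counterexamples.
witnesses = counterexamples(if_aids_then_hiv, if_hiv_then_aids)
converse_follows = not witnesses
print("Converse follows:", converse_follows)
print("Counterexamples (aids, hiv):", witnesses)
```

The single counterexample is the case in which someone has HIV but not AIDS: the original conditional is still true there, yet its converse is false, so the conditional cannot be treated as a biconditional.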


The Roles of Analogy in Scientific 
Thinking 


One of the most widely mentioned rea- 
soning processes used in science is analogy. 
Scientists use analogies to form a bridge 
between what they already know and what 
they are trying to explain, understand, or dis- 
cover. In fact, many scientists have claimed 
that the use of certain analogies was instru- 
mental in their making a scientific discovery, 
and almost all scientific autobiographies and 
biographies feature an important analogy 
that is discussed in depth. Coupled with the 
fact that there has been an enormous re- 
search program on analogical thinking and 
reasoning (see Holyoak, Chap. 6), we now 
have a number of models and theories of 
analogical reasoning that show exactly how 
analogy can play a role in scientific discovery 
(see Gentner, Holyoak, & Kokinov, 2001). 
By analyzing the use of analogies in sci- 
ence, Thagard and Croft (1999), Nersessian 
(1999), Gentner and Jeziorski (1993), and 
Dunbar and Blanchette (2001) all have
shown that analogical reasoning is a key
aspect of scientific discovery.

Traditional accounts of analogy distin- 
guish between two components of analog- 
ical reasoning — the target and the source. 
The target is the concept or problem that 
a scientist is attempting to explain or solve. 
The source is another piece of knowledge 
that the scientist uses to understand the tar- 
get, or to explain the target to others. What 
the scientist does when he or she makes an 
analogy is to map features of the source onto 
features of the target. By mapping the fea- 
tures of the source onto the target, new fea- 
tures of the target may be discovered, or the 
features of the target can be rearranged so 
that a new concept is invented and a scien- 
tific discovery is made. A common analogy 
used with computers is to describe a harmful 
piece of software as a computer virus. Once
a piece of software is called a virus, people
can map onto it features of biological
viruses, such as being small, spreading easily,
self-replicating using a host, and causing
damage. People map not only single features
of the source onto the target but also the
systems of relations among those features.
They also make analogical inferences.
If a computer virus is similar to a
biological virus, for example, an immune
system can be created on computers that can
protect computers from future variants of 
a virus. One of the reasons scientific anal- 
ogy is so powerful is that it can generate 
new knowledge such as the creation of a 
computational immune system having many 
of the features of a real biological immune 
system. This also leads to predictions that
there will be newer computer viruses that
are the computational equivalent of
retroviruses (which lack DNA, or standard
instructions) and that will elude the
computational immune system.
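
The mapping and inference just described can be sketched in a few lines of code. This is a toy illustration only, not any published model of analogical reasoning: the relation triples, entity names, and the `new:` prefix for hypothesized entities are all assumptions made for the example.

```python
# Relations are (predicate, subject, object) triples; all names are illustrative.
SOURCE = {  # biological domain
    ("infects", "virus", "host_cell"),
    ("replicates_in", "virus", "host_cell"),
    ("damages", "virus", "host_cell"),
    ("protects_against", "immune_system", "virus"),
}
TARGET = {  # computing domain
    ("infects", "malicious_program", "computer"),
    ("replicates_in", "malicious_program", "computer"),
    ("damages", "malicious_program", "computer"),
}

def align(source, target):
    """Map source entities to target entities that fill the same role in a shared predicate."""
    mapping = {}
    for pred, s_subj, s_obj in sorted(source):
        for t_pred, t_subj, t_obj in sorted(target):
            if pred == t_pred:
                mapping.setdefault(s_subj, t_subj)
                mapping.setdefault(s_obj, t_obj)
    return mapping

def candidate_inferences(source, target, mapping):
    """Project source relations onto the target; keep those the target does not yet contain."""
    inferred = set()
    for pred, subj, obj in source:
        projected = (
            pred,
            mapping.get(subj, f"new:{subj}"),  # unmapped entities become hypothesized ones
            mapping.get(obj, f"new:{obj}"),
        )
        if projected not in target:
            inferred.add(projected)
    return inferred

mapping = align(SOURCE, TARGET)
inferences = candidate_inferences(SOURCE, TARGET, mapping)
print(mapping)
print(inferences)
```

Because the source contains a relation with no counterpart in the target, the sketch proposes a new target entity, an immune system that protects against the malicious program, which corresponds to the analogical inference discussed in the text.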

The process of making an analogy involves
a number of key steps: retrieving a source
from memory, aligning the features
of the source with those of the target,
mapping features of the source onto those of
the target, and possibly making new
inferences about the target. Scientific discoveries
are made when the source highlights a
previously unnoticed feature of the target or
restructures the target into a new set of
relations. Interestingly, research on analogy has
shown that participants do not easily use 
analogy (see Gentner et al., 1997; Holyoak 
& Thagard, 1995). Participants tend to fo- 
cus on the sharing of a superficial feature 
between the source and the target, rather 
than the relations among features. In his 
InVivo studies of science, Dunbar (1995, 
2001, 2002) investigated the ways that scien- 
tists use analogies while they are conducting 
their research and found that scientists use 
both relational and superficial features when 
they make analogies. The choice of whether 
to use superficial or relational features de- 
pends on their goals. If their goal is to fix 
a problem in an experiment, their analogies 
are based upon superficial features. If their 
goal is to formulate hypotheses, they focus 
on analogies based upon sets of relations. 
One important difference between scien- 
tists and participants in experiments is that 
the scientists have deep relational knowl- 
edge of the processes they are investigat- 
ing and can use that relational knowledge to 
make analogies. 

Analogies sometimes lead scientists and 
students astray. Evelyn Fox-Keller (1985) 
shows how an analogy between the pulsing 
of a lighthouse and the activity of the slime 
mold Dictyostelium led researchers astray for
a number of years. Likewise, the analogy 
between the solar system (the source) and 
the structure of the atom (the target) has 
been shown to be potentially misleading to 
students taking more advanced courses in 
physics or chemistry. The solar system
analogy has a number of misalignments with the
structure of the atom: electrons repel
rather than attract one another, and
electrons do not have individual orbits like
planets but instead occupy orbital clouds
of electron density. Furthermore, students
have serious misconceptions about the na- 
ture of the solar system, which can com- 
pound their misunderstanding of the nature 
of the atom (Fischler & Lichtfield, 1992). Al- 
though analogy is a powerful tool in science, 
as is the case with all forms of induction,
incorrect conclusions can be reached.


Conceptual Change in the
Scientific Mind 


Many researchers have noted that an im- 
portant component of science is the gen- 
eration of new concepts and modification 
of existing ones. Scientific concepts, like all 
concepts, can be characterized as contain- 
ing representations of words, thoughts, ac- 
tions, objects, and processes. How does one’s 
knowledge of scientific concepts change over 
time? The large-scale changes that occur in 
conceptual structures have been labeled con- 
ceptual change (see Chi & Ohlsson, Chap. 
16; Nersessian, 2002; Thagard, 1992). The- 
ories of conceptual change focus on two 
main types of shifts. One is the addition 
of knowledge to a pre-existing conceptual 
structure. Here, there is no conflict between 
the pre-existing conceptual knowledge and 
the new information the student is acquir- 
ing. Such minor conceptual shifts are rela- 
tively easy to acquire and do not demand 
restructuring of the underlying representa- 
tions of scientific knowledge. The second 
type of conceptual shift is what is known as 
“radical conceptual change” (see Keil, 1999, 
and Nersessian, 1998, for reviews of this lit- 
erature). In this type of situation, it is nec- 
essary for a new conceptual system to be 
acquired that organizes knowledge in new 
ways, adds new knowledge, and results in 
a very different conceptual structure. This 
radical conceptual change is thought to be 
necessary for acquiring many new concepts 
in physics and is regarded as the major source 
of difficulty for students. The factors at the
root of this conceptual shift have been
difficult to determine, although a number of 
studies in human development (Carey, 1985; 
Chi, 1992; Chi & Roscoe, 2002), in the
history of science (Nersessian, 1998; Thagard,
1992), and in physics education (Clement, 
1982; Mestre, 1991) give detailed accounts 
of the changes in knowledge representation 
that occur when people switch from one 
way of representing scientific knowledge to 
another. A beautiful example of conceptual
change is shown in Figure 29.2. This
illustration is taken from the first edition of
Newton’s Method of Fluxions and shows
the ancient Greeks looking on in amazement
at an English hunter who shoots at a
bird using Newton’s new method of fluxions. 
Clearly they had not undergone the concep- 
tual change needed to understand Newto- 
nian physics. 

One area in which students show great
difficulty in understanding scientific
concepts is physics. Analyses of students’
changing conceptions, using interviews,
verbal protocols, and behavioral outcome
measures, indicate that large-scale changes in
students’ concepts occur in physics
education (see McDermott & Redish, 1999, for
a review of this literature). Following Kuhn
(1962), researchers have noted that students’ 
changing conceptions are similar to the se- 
quences of conceptual changes in physics 
that have occurred in the history of science. 
These notions of radical paradigm shifts and
ensuing incompatibility with past knowledge
states have led researchers to draw
interesting parallels between the development
of particular scientific concepts in children
and in the history of physics.

Investigations of naive people’s under- 
standing of motion indicate that students 
have extensive misunderstandings of mo- 
tion. This naive physics research indicates 
that many people hold erroneous be- 
liefs about motion similar to a medieval 
“Impetus” theory (McCloskey, Caramazza, 
& Green, 1980). Furthermore, students ap- 
pear to maintain “Impetus” notions even af- 
ter one or two courses in physics. In fact, 
some authors have noted that students who 
have taken one or two courses in physics 
may perform worse on physics problems 
than naive students (Mestre, 1991). It is 
only after extensive learning that we see 
a conceptual shift from “Impetus” theo- 
ries of motion to Newtonian scientific the- 
ories. How one’s conceptual representation 
shifts from “naive” to Newtonian is a mat- 
ter of contention because some have argued 
that the shift involves a radical conceptual 
change, whereas others have argued that the 
conceptual change is not really complete. 
Kozhevnikov and Hegarty (2001) argue that
many naive “Impetus” notions of motion
persist alongside Newtonian principles even
with extensive training in physics. They argue that such
“Impetus” principles are maintained at an 
implicit level. Thus, although students can 
give the correct Newtonian answer to
problems, their response times indicate that
they are also using impetus theories.

Although conceptual changes are thought 
to be large-scale changes in concepts that 
occur over extensive periods of time, it has 
been possible to observe conceptual change 
using InVivo methodologies. Dunbar (1995) 
reported a major conceptual shift that oc- 
curred in immunologists, in which they ob- 
tained a series of unexpected findings that 
forced the scientists to propose a new con- 
cept in immunology that, in turn, forced the 
change in other concepts. The drive behind 
this conceptual change was the discovery of 
a series of different unexpected findings or 
anomalies that required the scientists to re- 
vise and reorganize their conceptual knowl- 
edge. Interestingly, this conceptual change 
was achieved by a group of scientists reason- 
ing collaboratively, rather than by one scien- 
tist working alone. Different scientists tend 
to work on different aspects of concepts, and 
also different concepts, that, when put to- 
gether, lead to a rapid change in entire con- 
ceptual structures. 

Overall, accounts of conceptual change
in individuals indicate that it is, indeed,
similar to conceptual change in entire
scientific fields. Individuals need to be confronted
with anomalies that their pre-existing theo- 
ries cannot explain before entire conceptual 
structures are overthrown. However, re- 
placement conceptual structures have to be 
generated before the old conceptual struc- 
ture can be discarded. Often, people do 
not overthrow their naive conceptual the- 
ories and have misconceptions in many fun- 
damental scientific concepts that are main- 
tained across the lifespan. 


The Scientific Brain 


In this chapter, we have provided an
overview of research into the workings of the
scientific mind and have shown
how the scientific mind possesses many
cognitive tools that are applied differently
depending on the task at hand. Research in
thinking and reasoning has recently been ex- 
tended to include a systematic analysis of the 
brain areas associated with scientific
reasoning using techniques such as functional
magnetic resonance imaging (fMRI), positron
emission tomography, and event-related
potentials. There are two main reasons for
taking this approach. First, these approaches 
allow the researcher to look at the en- 
tire human brain, making it possible to see 
the many different sites involved in sci- 
entific thinking and to gain a more com- 
plete understanding of the entire range of 
mechanisms involved in scientific think- 
ing. Second, these brain-imaging approaches 
allow researchers to address fundamental 
questions in research on scientific thinking. 
One important question concerns the extent 
to which ordinary thinking in nonscientific 
contexts and scientific thinking recruit sim- 
ilar versus disparate neural structures of the 
brain. Dunbar (2002) proposed that scien- 
tific thinking uses the same cognitive mech- 
anisms all human beings possess, rather than 
being an entirely different type of thinking. 
He has proposed that in scientific thinking, 
standard cognitive processes are used, but 
are combined in ways that are specific to a 
particular aspect of science or a specific dis- 
cipline of science. By comparing the results 
of brain imaging investigations of scientific 
thinking with brain imaging studies of non- 
scientific thinking, we can see both whether 
and when common versus dissociated brain 
sites are invoked during different cognitive 
tasks. This approach will make it possible to 
articulate more clearly what scientific think- 
ing is and how it is both similar to and differ- 
ent from the nonscientific thinking typically 
examined in the cognitive laboratory (also 
see Goel, Chap. 20). 

Figure 29.2. Conceptual change in science: The ancient Greeks look on in
amazement as a hunter uses Newtonian principles to shoot down a bird. This
figure is taken from the frontispiece of Newton’s Method of Fluxions and
Infinite Series; with its Application to the Geometry of Curve Lines.
Frontispiece in Bodleian Library.

Considering the large arsenal of cognitive
tools researchers have at their disposal,
determining the neurological underpinning of
scientific thinking becomes mainly a matter
of dissecting the processes thought to
be involved in the reasoning process and
conducting systematic experiments on these
subprocesses. What might these subprocesses
be? As the previous sections of
this chapter show, scientific thinking
involves many cognitive capabilities including,
but not limited to, analogical reasoning,
causal reasoning, induction, deduction,
and problem solving. These subprocesses
undoubtedly possess common and distinct
neural signatures. A number of cognitive 
neuroscientists recently examined problem 
solving (Fincham et al., 2002; Goel & 
Grafman, 1995; Colvin, Dunbar, & Graf- 
man, 2001), analogical reasoning (Wharton 
et al., 2000; Kroger et al., 2002),
hypothesis testing (Fugelsang & Dunbar, in press),
inductive reasoning (Seger et al., 2000),
and deductive reasoning
(Parsons & Osherson, 2001; Osherson
et al., 1998). They all pointed to the role of 
the dorsolateral prefrontal/parietal network 
for tasks requiring these higher level cogni- 
tive capacities. It is important to note that 
this brain network has been implicated in 
tasks that place high demands on attention
and working memory.

One question that cognitive neuroscience
investigations of scientific thinking are
beginning to address concerns the neurological
underpinnings of conceptual change. Using
fMRI to investigate students who have and 
who have not undergone conceptual change 
in scientific areas, it is possible to uncover the 
neural changes that accompany conceptual 
change. Fugelsang and Dunbar (submitted) 
have found shifts from ventral pathways to 
dorsal pathways in the brain when students 
shift from naive impetus theories of motion 
to Newtonian theories of motion. These cog- 
nitive neuroscience investigations reveal the 
ways that knowledge is organized in the sci- 
entific brain and provide detailed accounts of 
the nature of the representation of scientific 
knowledge. 

The extent to which these processes are 
lateralized in the right or left hemisphere 
is a matter of recent debate, especially as 
it pertains to inductive and deductive rea- 
soning. Hemispheric differences in scientific 
deductive thinking potentially can be quite 
revealing about the nature of the represen- 
tations of the scientific mind. For exam- 
ple, recent cognitive neuroscience research 
can provide important new insights into 
one of the most fundamental questions that 
has perplexed many scientists for decades — 
namely, whether complex scientific think- 
ing processes, such as deductive and induc- 
tive reasoning, are represented in terms of 
linguistic or visual-spatial representations. 
Anecdotal claims are equivocal as to the na- 
ture of such representations. When think- 
ing about scientific concepts and devising 
theoretical explanations for phenomena, for 
example, scientists may verbally represent 
their theories in text or visually represent
their theories in graphical models. More often
than not, scientific theories are represented
in both modalities to some degree. 

Based on what we know about hemi- 
spheric differences in the brain, there are 
several clear predictions about how spatial 
and verbal thinking styles would be repre- 
sented in the brain. If scientific thinking were 
predominantly based on verbal or linguistic 
representations, for example, we would ex- 
pect activations of the basic language neu- 
ral structures such as the frontal and inferior 
temporal regions in the left hemisphere. If 
scientific thinking were predominantly based
on visual-spatial representations, one would 
expect activation of the basic perception 
and motor control neural structures such 
as those found in the parietal and occipital 
lobes, particularly in the right hemisphere. 
To date, findings from research on this issue 
have been quite mixed. Goel and colleagues
(e.g., Goel et al., 1998; Goel, Chap. 20) have
found that significant activations for
deductive reasoning occur predominantly in
the left hemisphere. Parsons and Osherson
(2001), using a similar but not identical
deductive reasoning task, found that such
tasks recruited resources predominantly
from the right hemisphere.

Much research has been conducted to
determine the cause of these different results,
and Goel (Chap. 20) provides a detailed
account of recent research on the brain and
deductive reasoning. One result regarding
hemispheric differences that is important for
studies of scientific thinking is that of Roser
et al. (in press). They conducted experimental
examinations of hemispheric differences
in scientific causal thinking in a split-brain
patient. They found that the patient’s right 
hemisphere was uniquely able to detect 
causality in perceptually salient events (i.e., 
colliding balls), whereas his left hemisphere 
was uniquely able to infer causality based 
on a more complex, not directly perceivable, 
chain of events. These data add to our grow- 
ing understanding of how the brain contains 
specialized neural structures that contribute 
to the interpretation of data obtained from 
the environment. The obvious experiments
to conduct next would allow
scientists to think and reason naturally about
their own theories versus theories from dif- 
ferent domains while being imaged. This 
would allow one to decompose the effects 
of scientific thinking and familiarity. Clearly, 
research on the scientific brain is about 
to begin. 


Computational Approaches to 
Scientific Thinking 


Along with recent brain imaging studies, 
computational approaches have provided a 
more complete account of the scientific 
mind. Computational models provide spe- 
cific detailed accounts of the cognitive pro- 
cesses underlying scientific thinking. Early 
computational work consisted of taking a 
scientific discovery and building computa- 
tional models of the reasoning processes 
involved in the discovery. Langley et al. 
(1987) built a series of programs that sim- 
ulated discoveries such as those of Coper- 
nicus and Stahl. These programs have vari- 
ous inductive reasoning algorithms built into 
them and, when given the data the scientists 
used, were able to propose the same rules. 
Computational models make it possible to 
propose detailed models of the cognitive 
subcomponents of scientific thinking that 
specify exactly how scientific theories are 
generated, tested, and amended (see Darden,
1997; Shrager & Langley, 1990, for accounts
of this branch of research). More recently, 
the incorporation of scientific knowledge 
into the computer programs resulted in a 
shift in emphasis from using programs to 
simulate discoveries to building programs 
that help scientists make discoveries. A num- 
ber of these computer programs have made 
novel discoveries. For example, Valdes-Perez 
(1994) built systems for discoveries in chem- 
istry, and Fajtlowicz has done this in mathe- 
matics (Erdos, Fajtlowicz, & Staton, 1991). 
These advances in the fields of computer 
discovery have led to new fields,
conferences, journals, and even departments that
specialize in programs
devised to search large databases in the
hope of making new scientific discoveries 
(Langley, 2000, 2002). This process is
commonly known as “data mining.” Only
relatively recently has this technique proven
viable, thanks to advances in computer
technology. An even more recent
development in the area of data mining is the use
of distributed computer networks that take 
advantage of thousands, or even millions, of 
computers worldwide to jointly mine data 
in the hope of making significant scientific 
discoveries. This approach has shown much 
promise because of its relative cost effec- 
tiveness. The most powerful supercomput- 
ers currently cost over 100 million dollars, 
whereas a distributed network server may 
cost only tens of thousands of dollars for 
roughly the same computational power. 
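
As a rough illustration of the kind of data-driven induction attributed above to the early discovery programs, the following toy sketch searches a small table of planetary data for a simple invariant. It is not the algorithm of Langley et al. (1987). The data values (approximate semi-major axes in astronomical units and orbital periods in years), the function names, and the small-exponent search are all assumptions introduced for illustration.

```python
from itertools import product
from statistics import mean, pstdev

# (semi-major axis in AU, orbital period in years); approximate textbook values.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}

def relative_spread(values):
    """Coefficient of variation: population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)

def find_invariant(data, max_exponent=3):
    """Search small integer exponents (p, q) for which T**p / a**q is most nearly constant."""
    best = None
    for p, q in product(range(1, max_exponent + 1), repeat=2):
        ratios = [(t ** p) / (a ** q) for a, t in data.values()]
        spread = relative_spread(ratios)
        if best is None or spread < best[0]:
            best = (spread, p, q)
    return best

spread, p, q = find_invariant(PLANETS)
print(f"Most nearly constant: T^{p} / a^{q} (relative spread {spread:.4f})")
```

Under these assumptions the search settles on the period squared divided by the distance cubed, the regularity known as Kepler's third law. The point of the sketch is only that a simple search over candidate terms, given the same observations a scientist had, can propose the same rule.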

Another recent shift in the use of com- 
puters in scientific discovery is to have com- 
puters and people make discoveries together, 
rather than expecting computers to make 
an entire scientific discovery. Now, instead
of mimicking the entire scientific discovery
process used by humans, computers can run
powerful algorithms that search for patterns in
large databases and pass those patterns to
humans, who can then use this output to make
discoveries in areas ranging from the human
genome to the structure of the universe.


Scientific Thinking and Science 
Education 


Science education has undergone many 
changes over the past hundred years that 
mirrored wider changes in both education 
and society. In the early 1900s, science edu- 
cation was seen as a form of nature study — 
particularly in kindergarten through
eighth grade. Each decade has seen a report
on the need to improve science education.
Starting in the 1930s, proponents of
the progressive education movement began 
a movement that continues to this day. They 
argued that children should learn more than just
facts and should be taught methods and general
principles, as well as ways in which science
relates to the child’s world. In
1938, a report by the Progressive Education 
Association noted that the psychology of the 
learner should be at the core of science edu- 
cation, as well as making a link to children’s 
everyday lives. Various reports on science ed- 
ucation appeared over the ensuing years, but 
it was the launch of the Sputnik satellite in 
1957 that transformed science education in 
the United States. Seeing the Soviets launch 
a rocket before the United States galvanized 
the nation into training better scientists and 
identifying the brightest students. The net 
result for science education was that textbooks
were updated, a factually based curriculum
was maintained, and the notion of
science as a voyage of discovery entered the 
popular imagination. By the 1980s, however, 
many cultural changes had occurred, and sci- 
ence students in the United States appeared 
to be falling behind those in other countries. 
Numerous reports by science teachers and 
scientists recommended widespread changes 
in the ways that science is taught. Most im- 
portant in these changes was the move to a 
constructivist view of education. According 
to this view, students construct their knowl- 
edge rather than being the passive recipients 
of scientific knowledge (see also Ritchhart & 
Perkins, Chap. 32, on teaching thinking). 
Beginning in the 1980s, a number of re- 
ports, often constructivist, stressed the need 
for teaching scientific thinking skills and not 
just methods and content. The addition of 
scientific thinking skills to the science
curriculum from kindergarten through adulthood
was a major shift in focus. Many of
the particular scientific thinking skills em- 
phasized were covered in previous sections 
of this chapter, such as deductive and induc- 
tive thinking strategies. Rather than focusing 
on one particular skill, such as induction, re- 
searchers in education have focused on how 
the different components of scientific think- 
ing are put together in science. Furthermore, 
science educators have focused on situations 
in which science is conducted collabora- 
tively, rather than being the product of one 


ence education parallel changes in method- 
ologies used to investigate science, such as 
analyzing the ways that scientists think and 
reason in their laboratories. 

By looking at science as a complex, multi- 
layered, and group activity, many researchers 
in science education have adopted a con- 
structivist approach. This approach sees 
learning as an active rather than a passive 
process and proposes that students learn 
through constructing their scientific knowl- 
edge. The goal of constructivist science edu- 
cation often is to produce conceptual change 
through guided instruction in which the 
teacher or professor acts as a guide to dis- 
covery rather than the keeper of all the facts. 
One recent and influential approach to sci- 
ence education is the inquiry-based learning 
approach. Inquiry-based learning focuses on 
posing a problem or a puzzling event to stu- 
dents and asking them to propose a hypoth- 
esis that can be used to explain the event. 
Next, students are asked to collect data that 
test the hypotheses, reach conclusions, and 
then reflect upon both the original problem 
and the thought processes they used to solve 
the problem. Students often use computers 
that aid in their construction of new knowl- 
edge. The computers allow students to learn 
many of the different components of scien- 
tific thinking. For example, Reiser and his 
colleagues have developed a learning envi- 
ronment for biology in which students are 
encouraged to develop hypotheses in groups, 
codify the hypotheses, and search databases 
to test them (Reiser et al., 2001). 

One of the myths of science is the lone 
scientist toiling under a naked lightbulb, 
suddenly shouting “Eureka, I have made a 
discovery!” Instead, InVivo studies of scien- 
tists (e.g., Dunbar, 1995, 2002), historical 
analyses of scientific discoveries (Nersessian, 
1999), and InVivo studies of children learn- 
ing science at museums all point to collab- 
orative scientific discovery mechanisms as 
being one of the driving forces of science 
(Crowley et al., 2001). What happens during 
collaborative scientific thinking is that there 
is usually a triggering event, such as an unex- 
pected result or situation that a student does 
[...] members of the group adding new information
to the person’s representation of knowledge, 
often adding new inductions and deduc- 
tions that both challenge and transform 
the reasoner’s old representations of knowledge
(Dunbar, 1998). This means that social
mechanisms, which traditional cognitive research
has ignored, play a key role in fostering
conceptual change and are crucial for both
science and science education. In science
education, there has been a
shift to collaborative learning, particularly at 
the elementary level, but in university edu- 
cation, the emphasis is still on the individual 
scientist. Because many domains of science 
now involve collaborations across scientific 
disciplines, we expect the explicit teach- 
ing of collaborative science heuristics to 
increase. 

What is the best way to teach and 
learn science? Surprisingly, the answer to 
this question has been difficult to un- 
cover. Although there has been consider- 
able research on the benefits of using a 
particular way of learning science, few com- 
parative studies of different methods have 
been conducted. For example, following Seymour
Papert’s book Mindstorms (1980),
many schools moved to discovery learning
in which children discover aspects of pro- 
gramming and mathematics through writ- 
ing their own computer programs in the 
LOGO programming language. This discov- 
ery learning approach, which thousands of 
schools have adopted, has been presented as 
an alternative to more didactic approaches 
to teaching and learning. Students who are
allowed to discover principles on their own
and to set their own goals are purported to
gain deeper knowledge that transfers more
appropriately. Although there is
much anecdotal evidence on the benefits of 
discovery learning, only recently has a di- 
rect comparison of discovery learning with 
more traditional methods been conducted. 
Klahr and Nigam (2004) conducted a study 
of third and fourth grade children learning 
about experimental design. They found that 
many more children learned from direct in- 
struction than from discovery learning. Fur- 


ing children did not have richer or deeper 
knowledge than direct instruction children. 
This type of finding suggests that pure dis- 
covery learning, although intuitively appeal- 
ing, benefits only a few children and that 
guided discovery coupled with explicit in- 
struction is one of the most effective educa- 
tional strategies in science. 


Conclusions and Future Directions 


Although much is known regarding certain 
components of scientific thinking, much re- 
mains to be discovered. In particular, there 
has been little contact among cognitive, neu- 
roscience, social, personality, and motiva- 
tional accounts of scientific thinking. Clearly, 
the relations among these different aspects 
of scientific thinking need to be combined 
to produce a comprehensive picture of the 
scientific mind. One way to achieve this is 
by using converging multiple methodolo- 
gies as outlined previously, such as natu- 
ralistic observation, controlled experiments 
in the cognitive laboratory, and functional 
brain imaging techniques. Theoretical devel- 
opments into the workings of the scientific 
mind would greatly benefit from more un- 
constrained analyses of the neuroanatomical 
correlates of the scientific reasoning process. 
We, as scientists, are beginning to get a rea- 
sonable grasp of the inner workings of the 
subcomponents of the scientific mind (i.e., 
problem solving, analogy, induction) and sci- 
entific thought. However, great advances re- 
main to be made concerning how these pro- 
cesses interact so scientific discoveries can 
be made. Future research will focus on both 
the collaborative aspects of scientific think- 
ing and the neural underpinnings of the sci- 
entific mind. 


Acknowledgments 


The authors would like to thank the fol- 
lowing organizations: Dartmouth College, 
McGill University, The Spencer Foundation, 
The National Science Foundation, and the
Engineering Research Council of Canada for
funding research discussed in this chapter.
The comments of Keith Holyoak, Vimla Pa- 
tel, and an anonymous reviewer were all 
helpful in making this a better chapter. 


References 


Ahn, W., Kalish, C. W., Medin, D. L., & Gelman, 
S. A. (1995). The role of covariation versus
mechanism information in causal attribution.
Cognition, 54, 299-352. 

Azmitia, M. A. & Crowley, K. (2001). The 
rhythms of scientific thinking: A study of col- 
laboration in an earthquake microworld. In K. 
Crowley, C. Schunn, & T. Okada (Eds.) De- 
signing for Science: Implications from Everyday, 
Classroom, and Professional Settings. Mahwah,
NJ: Erlbaum. 

Bacon, F. (1620/1854). Novum Organum. (B. 
Montague, Trans.). Philadelphia: Parry &
McMillan. 

Baker, L. M., & Dunbar, K. (2000). Experimental 
design heuristics for scientific discovery: The 
use of baseline and known controls. International
Journal of Human Computer Studies, 53,
335-349.

Bruner, J. S., Goodnow, J. J., & Austin, G. A. 
(1956). A Study of Thinking. New York: Science 
Editions. 

Carey, S. (1985). Conceptual Change in Child- 
hood. Cambridge, MA: MIT Press. 

Carruthers, P., Stich, S., & Siegal, M. (2002). 
The Cognitive Basis of Science. New York: Cam- 
bridge University Press. 

Chi, M. (1992). Conceptual change within and 
across ontological categories: Examples from 
learning and discovery in science. In R. Giere 
(Ed.), Cognitive Models of Science. Minneapolis: 
University of Minnesota Press. 

Chi, M. T. H., & Roscoe, R. D. (2002). The pro- 
cesses and challenges of conceptual change. In 
M. Limon, & L. Mason (Eds.), Reconsidering 
Conceptual Change: Issues in Theory and Practice
(pp. 3-27). The Netherlands: Kluwer Academic
Publishers.

Clement, J. (1982). Students’ preconceptions in 
introductory mechanics. American Journal of 
Physics, 50, 66-71. 

Colvin, M. K., Dunbar, K., & Grafman, J. (2001).
The effects of frontal lobe lesions on goal
achievement in the water jug task. Journal of
Cognitive Neuroscience, 13, 1129-1147.

Darden, L. (1997). Strategies for discovering 
mechanisms: Schema instantiation, modular 
subassembly, forward chaining/backtracking. 
Proceedings of the 1997 Biennial Meeting of the 
Philosophy of Science Association. 


Dunbar, K. (1995). How scientists really reason: 
Scientific reasoning in real-world laborato- 
ries. In R. J. Sternberg, & J. Davidson (Eds.), 
Mechanisms of Insight. Cambridge, MA: MIT 
Press. 


Dunbar, K. (1997). How scientists think: Online 
creativity and conceptual change in science. 
In T. B. Ward, S. M. Smith, & S. Vaid (Eds.), 
Conceptual Structures and Processes: Emergence, 
Discovery and Change. Washington, DC: APA
Press.


Dunbar, K. (1998). Problem solving. In W. 
Bechtel, & G. Graham (Eds.). A Companion to 
Cognitive Science. London, UK: Blackwell. 


Dunbar, K. (1999). The scientist InVivo: How sci- 
entists think and reason in the laboratory. In L. 
Magnani, N. Nersessian, & P. Thagard (Eds.), 
Model-based Reasoning in Scientific Discovery. 
New York: Plenum.


Dunbar, K. (2001). The analogical paradox: Why 
analogy is so easy in naturalistic settings, yet 
so difficult in the psychology laboratory. In D. 
Gentner, K. J. Holyoak, & B. Kokinov, Analogy: 
Perspectives from Cognitive Science. Cambridge, 
MA: MIT Press. 


Dunbar, K. & Blanchette, I. (2001). The in- 
vivo/invitro approach to cognition: the case of 
analogy. Trends in Cognitive Sciences, 5,
334-339.

Dunbar, K. (2002). Science as category: Implica- 
tions of InVivo science for theories of cognitive 
development, scientific discovery, and the nature
of science. In P. Carruthers, S. Stich, & M.
Siegal (Eds.), The Cognitive Basis of Science. New
York: Cambridge University Press. 


Dunbar, K., & Fugelsang, J. (2004). Causal think- 
ing in science: How scientists and students in- 
terpret the unexpected. To appear in M. E. 
Gorman, A. Kincannon, D. Gooding, & R. D. 
Tweney (Eds.), New Directions in Scientific and 
Technical Thinking. Hillsdale, NJ: Erlbaum. 

Dunbar, K. & Sussman, D. (1995). Toward a cog- 
nitive account of frontal lobe function: Simu- 
lating frontal lobe deficits in normal subjects. 
Annals of the New York Academy of Sciences, 
769, 289-304. 


Erdos, P., Fajtlowicz, S., & Staton, W. (1991). Degree
sequences in the triangle-free graphs. Discrete
Mathematics, 92 (91), 85-88.

Fincham, J. M., Carter, C. S., van Veen, V., 
Stenger, V. A., & Anderson, J. R. (2002). Neu- 
ral mechanisms of planning: A computational 
analysis using event-related fMRI. Proceedings
of the National Academy of Sciences, 99,
3346-3351.

Fischler, H., & Lichtfeldt, M. (1992). Modern 
physics and students’ conceptions. International
Journal of Science Education, 14, 181-190. 

Fox-Keller, E. (1985). Reflections on Gender and 
Science. New Haven, CT: Yale University Press. 

Fugelsang, J., & Dunbar, K. (submitted). How the 
brain uses theory to interpret data. 

Fugelsang, J., & Dunbar, K. (in preparation). How 
the brain learns physics. 

Fugelsang, J., Stein, C., Green, A., & Dunbar, K. 
(2004). Theory and data interactions of the sci- 
entific mind: Evidence from the molecular and 
the cognitive laboratory. Canadian Journal of 
Experimental Psychology, 58, 132-141. 

Galilei, G. (1638/1991). Dialogues Concerning Two
New Sciences (A. de Salvio & H. Crew, Trans.).
Amherst, NY: Prometheus Books. 

Galison, P. (2003). Einstein’s Clocks, Poincaré's 
Maps: Empires of Time. New York, NY: W. W. 
Norton. 

Gentner, D., Brem, S., Ferguson, R. W.,
Markman, A. B., Levidow, B. B., Wolff, P., et
al. (1997). Analogical reasoning and conceptual
change: A case study of Johannes Kepler.
The Journal of the Learning Sciences, 6(1),
3-40.

Gentner, D., Holyoak, K. J., & Kokinov, B. (2001).
The Analogical Mind: Perspectives from Cognitive 
Science. Cambridge, MA: MIT Press. 

Gentner, D., & Jeziorski, M. (1993). The shift 
from metaphor to analogy in western science. 
In A. Ortony (Ed.), Metaphor and Thought (2nd 
ed., pp. 447-480). Cambridge, England: Cam- 
bridge University Press. 

Giere, R. (1993). Cognitive Models of Science.
Minneapolis, MN: University of Minnesota Press.

Goel, V., & Dolan, R. J. (2000). Anatomical
segregation of component processes in an inductive
inference task. Journal of Cognitive
Neuroscience, 12, 110-119.

Goel, V., Gold, B., Kapur, S., & Houle, S. (1998).
Neuroanatomical correlates of human reasoning.
Journal of Cognitive Neuroscience, 10,
293-302.


Goel, V., & Grafman, J. (1995). Are the frontal
lobes implicated in “planning” functions? Interpreting
data from the Tower of Hanoi.
Neuropsychologia, 33, 623-642.

Gorman, M. E. (1989). Error, falsification and 
scientific inference: An experimental inves- 
tigation. Quarterly Journal of Experimental 
Psychology: Human Experimental Psychology, 
41A, 385-412.

Gorman, M. E., Kincannon, A., Gooding, D., & 
Tweney, R. D. (2004). New Directions in Scientific
and Technical Thinking. Hillsdale, NJ:
Lawrence Erlbaum.


Heit, E. (2000). Properties of inductive reasoning.
Psychonomic Bulletin & Review, 7, 569-592.

Hirschberg, C. (2003). My mother, the scientist.
In R. Dawkins, & T. Folger (Eds.), The Best
American Science and Nature Writing 2003.
New York: Houghton Mifflin.

Holyoak, K. J., & Thagard, P. (1995). Mental 
Leaps. Cambridge, MA: MIT Press. 

Keil, F. C. (1999). Conceptual change. In R.
Wilson & F. Keil (Eds.), The MIT Encyclopedia
of Cognitive Science. Cambridge, MA: MIT
Press.

Klahr, D. (2000). Exploring Science: The Cognition 
and Development of Discovery Processes. Cam- 
bridge, MA: MIT Press. 

Klahr, D., & Dunbar, K. (1988). Dual space search 
during scientific reasoning. Cognitive Science, 
12, 1-48. 

Klahr, D., & Nigam, M. (2004). The equivalence 
of learning paths in early science instruction: 
Effects of direct instruction and discovery 
learning. Psychological Science, 15, 661-667.

Klahr, D., & Simon, H. (1999). Studies of sci- 
entific discovery: Complementary approaches 
and convergent findings. Psychological Bulletin, 
125, 524-543. 

Klayman, J., & Ha, Y. (1987). Confirmation, dis- 
confirmation, and information in hypothesis 
testing. Psychological Review, 94, 211-228. 

Kozhevnikov, M., & Hegarty, M. (2001). Impe- 
tus beliefs as default heuristic: Dissociation be- 
tween explicit and implicit knowledge about 
motion. Psychonomic Bulletin and Review, 8, 
439-453.

Kroger, J. K., Sabb, F. W., Fales, C. L.,
Bookheimer, S. Y., Cohen, M. S., & Holyoak, K. 
J. (2002). Recruitment of anterior dorsolateral 
prefrontal cortex in human reasoning: A para- 
metric study of relational complexity. Cerebral 
Cortex, 12, 477-485. 


Kuhn, T. (1962). The Structure of Scientific Revolutions.
Chicago: University of Chicago Press.

Kulkarni, D., & Simon, H. A. (1988). The processes
of scientific discovery: The strategy of
experimentation. Cognitive Science, 12,
139-176.


Langley, P. (2000). Computational support of 
scientific discovery. International Journal of 
Human-Computer Studies, 53, 393-410. 

Langley, P. (2002). Lessons for the computational 
discovery of scientific knowledge. In the Pro- 
ceedings of the First International Workshop on 
Data Mining Lessons Learned. 

Langley, P., Simon, H. A., Bradshaw, G. L., & 
Zytkow, J. M. (1987). Scientific Discovery: Com- 
putational Explorations of the Creative Processes. 
Cambridge, MA: MIT Press. 

McCloskey, M., Caramazza, A., & Green, B. 
(1980). Curvilinear motion in the absence of 
external forces: Naive beliefs about the motion 
of objects. Science, 210, 1139-1141. 

McDermott, L. C., & Redish, L. (1999). Research
letter on physics education research. American
Journal of Physics, 67, 755.

Mestre, J. P. (1991). Learning and instruction in
pre-college physical science. Physics Today, 44,
56-62.

Mitroff, I. (1974). The Subjective Side of Science.
Amsterdam: Elsevier.

Nersessian, N. (1998). Conceptual change. In W.
Bechtel, & G. Graham (Eds.). A Companion to
Cognitive Science. London, UK: Blackwell.

Nersessian, N. (1999). Models, mental mod- 
els, and representations: Model-based reason- 
ing in conceptual change. In L. Magnani, N. 
Nersessian, & P. Thagard (Eds.). Model-based 
Reasoning in Scientific Discovery. New York: 
Plenum. 


Nersessian, N. J. (2002). The cognitive basis 
of model-based reasoning in science. In P.
Carruthers, S. Stich, & M. Siegal (Eds.), The
Cognitive Basis of Science. New York: Cam- 
bridge University Press. 

Newton, I. (1736). Method of Fluxions and Infinite 
Series; with its Application to the Geometry of 
Curve Lines. Frontispiece in Bodleian Library. 

Newell, A., & Simon, H. A. (1972). Human Prob- 
lem Solving. Oxford, UK: Prentice-Hall. 

Osherson, D., Perani, D., Cappa, S., Schnur, T., 
Grassi, F., & Fazio, F. (1998). Distinct brain
loci in deductive versus probabilistic reasoning. 
Neuropsychologia, 36, 369-376. 


Papert, S. (1980). Mindstorms: Children, Computers,
and Powerful Ideas. New York: Basic
Books.

Parsons, L. M., & Osherson, D. (2001). New ev- 
idence for distinct right and left brain systems 
for deductive versus probabilistic reasoning. 
Cerebral Cortex, 11, 954-965. 

Penner, D. E., & Klahr, D. (1996). When to trust 
the data: Further investigations of system er- 
ror in a scientific reasoning task. Memory and 
Cognition, 24(5), 655-668. 

Popper, K. R. (1959). The Logic of Scientific Dis- 
covery. London, UK: Hutchinson. 

Qin, Y., & Simon, H. A. (1990). Laboratory repli- 
cation of scientific discovery processes. Cogni- 
tive Science, 14, 281-312. 

Reiser, B. J., Tabak, I., Sandoval, W. A., Smith, B., 
Steinmuller, F., & Leone, T. J. (2001). BGuILE:
Strategic and conceptual scaffolds for scientific
inquiry in biology classrooms. In S. M. Carver, 
& D. Klahr (Eds.). Cognition and Instruction: 
Twenty Five Years of Progress. Mahwah, NJ: 
Erlbaum. 

Roser, M., Fugelsang, J., Dunbar, K., Corballis, P., 
& Gazzaniga, M. (Submitted). Causality per- 
ception and inference in the split-brain. 

Schunn, C. D., & Klahr, D. (1995). A 4-space 
model of scientific discovery. In the Proceedings 
of the 17th Annual Conference of the Cognitive 
Science Society. Pittsburgh, PA. 

Schunn, C. D., & Klahr, D. (1996). The prob- 
lem of problem spaces: When and how to go 
beyond a 2-space model of scientific discov- 
ery. Part of symposium on building a theory of 
problem solving and scientific discovery: How 
big is N in N-space search? In the Proceedings 
of the 18th Annual Conference of the Cognitive 
Science Society. San Diego, CA. 

Seger, C., Poldrack, R., Prabhakaran, V., Zhao, 
M., Glover, G., & Gabrieli, J. (2000). Hemi- 
spheric asymmetries and individual differences 
in visual concept learning as measured by func- 
tional MRI. Neuropsychologia, 38, 1316-1324. 

Shrager, J., & Langley, P. (1990). Computational 
Models of Scientific Discovery and Theory For- 
mation. San Mateo, CA: Morgan Kaufmann. 

Simon, H. A. (1977). Models of Discovery. Dor- 
drecht, The Netherlands: D. Reidel. 

Simon, H. A., & Lea, G. (1974). Problem solving 
and rule induction. In H. Simon (Ed.), Models 
of Thought (pp. 329-346). New Haven, CT:
Yale University Press. 


Smith, E. E., Shafir, E., & Osherson, D. (1993).
Similarity, plausibility, and judgments of probability.
Cognition. Special Issue: Reasoning and
Decision Making, 49, 67-96.

Thagard, P. (1992). Conceptual Revolutions. Cambridge,
MA: MIT Press.

Thagard, P. (1999). How Scientists Explain
Disease. Princeton: Princeton University
Press.

Thagard, P., & Croft, D. (1999). Scientific
discovery and technological innovation: Ulcers,
dinosaur extinction, and the programming
language Java. In L. Magnani, N. Nersessian,
& P. Thagard (Eds.). Model-based
Reasoning in Scientific Discovery. New York:
Plenum.

Tweney, R. D. (1989). A framework for the cognitive
psychology of science. In B. Gholson, A.
Houts, R. A. Neimeyer, & W. Shadish (Eds.),
Psychology of Science and Metascience. Cambridge,
MA: Cambridge University Press.

Tweney, R. D., Doherty, M. E., & Mynatt,
C. R. (1981). On Scientific Thinking. New York:
Columbia University Press.

Valdes-Perez, R. E. (1994). Conjecturing hidden 
entities via simplicity and conservation laws: 
Machine discovery in chemistry. Artificial In- 
telligence, 65 (2), 247-280. 

Wason, P. C. (1968). Reasoning about a rule. 
Quarterly Journal of Experimental Psychology, 
20, 273-281. 

Wertheimer, M. (1945). Productive Thinking. 
New York: Harper. 

Wharton, C., & Grafman, J. (1998). Reasoning 
and the human brain. Trends in Cognitive Sciences,
2, 54-59.

Wharton, C. M., Grafman, J., Flitman, S. S., 
Hansen, E. K., Brauner, J., Marks, A., et 
al. (2000). Toward neuroanatomical models 
of analogy: A positron emission tomography 
study of analogical mapping. Cognitive Psychology,
40, 173-197.




CHAPTER 30 


Thinking and Reasoning in Medicine 


Vimla L. Patel 
José F. Arocha 
Jiajie Zhang 


What Is Medical Reasoning? 


Medical reasoning describes a form of qual- 
itative inquiry that examines the cogni- 
tive (thought) processes involved in making 
medical decisions. Clinical reasoning, med- 
ical problem solving, diagnostic reasoning, 
and decision making are all terms used in 
a growing body of literature that examines 
how clinicians make clinical decisions. Med- 
ical cognition refers to studies of cognitive 
processes, such as perception, comprehen- 
sion, decision making (see LeBoeuf & Shafir, 
Chap. 11), and problem solving (see Novick 
& Bassok, Chap. 14) in medical practice itself 
or in tasks representative of medical prac- 
tice. These studies use subjects who work in 
medicine, including medical students, physi- 
cians, and biomedical scientists. The study 
of medical reasoning has been the focus of 
much research in cognitive science and artifi- 
cial intelligence in medicine. Medical reason- 
ing involves an inferential process for making 
diagnostic or therapeutic decisions or un- 
derstanding the pathology of a disease pro- 
cess. On the one hand, medical reasoning is 


basic to all higher-level cognitive processes 
in medicine such as problem solving and 
medical text comprehension. On the other 
hand, the structure of medical reasoning it- 
self is the subject of considerable scrutiny. 
For example, the directionality of reason- 
ing in medicine has been an issue of con- 
siderable controversy in medical cognition, 
medical education, and artificial intelligence 
(AI) in medicine. Conventionally, we can 
partition medical reasoning into clinical and 
biomedical or basic science reasoning. These 
are some of the central themes that consti- 
tute this chapter. 


Early Research on Medical Problem 
Solving and Reasoning 


Medical cognition is a subfield of cogni- 
tive science devoted to the study of cogni- 
tive processes in medical tasks. Studies of 
medical cognition include analyses of perfor- 
mance in “real world” clinical tasks as well 
as in experimental tasks. Understanding the 
[...] reasoning in order to promote more effective
practices has been the subject of concern for 
nearly a century (Osler, 1906). 

Human information processing research 
typically has focused on the individual. The 
dual focus on in-depth task analysis and on 
the study of human performance is a central 
feature of a cognitive science approach. 

There have been two primary approaches 
to research investigating clinical reasoning in 
medicine: the decision-analytic approach
and the information-processing or problem- 
solving approach. Decision analysis uses a 
formal quantitative model of inference and 
decision making as the standard of compari- 
son (Dowie & Elstein, 1988). It compares the 
performance of a physician with the mathe- 
matical model by focusing on reasoning “fal- 
lacies” and biases inherent in human clin- 
ical decision making (Leaper et al., 1972). 
In contrast, the information-processing ap- 
proach focuses on the description of cog- 
nitive processes in reasoning tasks and the 
development of cognitive models of perfor- 
mance, typically relying on protocol analysis 
(Ericsson & Simon, 1993) and other observational
techniques.

Systematic investigations of medical ex- 
pertise began more than forty years ago with 
the research by Ledley and Lusted (1959) 
on clinical inquiries. They proposed a two- 
stage model of clinical reasoning involving 
a hypothesis-generation stage followed by 
a hypothesis-evaluation stage in which the 
latter stage was amenable to formal deci- 
sion analytic techniques. Probably the ear- 
liest empirical studies of medical reason- 
ing can be traced to the work of Rimoldi 
(1961), who conducted experimental studies 
of diagnostic reasoning contrasting students 
with medical experts in simulated problem- 
solving tasks. The results emphasized the 
greater ability of expert physicians to se- 
lectively attend to relevant information and 
to narrow the set of diagnostic possibilities 
(i.e., consider fewer hypotheses). As cogni- 
tive science came into prominence in the 
early 1970s spearheaded by the immensely 
influential work of Newell and Simon (1972) 
on problem solving, research in information- 


cally. Problem solving was conceived of as a 
search in a problem space in which a prob- 
lem solver was viewed as selecting an option 
(e.g., a hypothesis or an inference) or per- 
forming an operation (from a set of possible 
operations) in moving toward a solution or a 
goal state (e.g., diagnosis or treatment plan). 
(See Novick & Bassok, Chap. 14, for a discus- 
sion of problem solving.) This conceptual- 
ization had an enormous impact in both cog- 
nitive psychology and artificial intelligence 
research. It also led to rapid advances 
in medical reasoning and problem-solving 
research, as exemplified by the seminal 
work of Elstein, Shulman, & Sprafka (1978). 
They were the first to use experimen- 
tal methods and theories of cognitive sci- 
ence to investigate clinical competency. 
Their extensive empirical research led to 
the development of an elaborated model 
of hypothetico-deductive reasoning, which 
proposed that physicians reason by first gen- 
erating and then testing a set of hypothe- 
ses to account for clinical data (i.e., reason- 
ing from hypothesis to data). This model 
of problem solving had a substantial influ- 
ence on studies of both medical cognition 
and medical education. 
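
The problem-space view described above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a model taken from the studies cited: the function name `search_problem_space`, the breadth-first strategy, and the two-jug toy task are all hypothetical choices made for this example. A state is a pair of jug contents, each operator maps a state to a new state, and the search moves from the start state toward a goal state.

```python
from collections import deque

def search_problem_space(start, is_goal, operators):
    """Breadth-first search through a problem space.

    Returns the list of (operator name, resulting state) steps that lead
    from `start` to a goal state, or None if no goal state is reachable.
    """
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        for name, apply_op in operators:
            nxt = apply_op(state)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [(name, nxt)]))
    return None

# Hypothetical toy task: jug A holds 4 units, jug B holds 3 units.
CAP_A, CAP_B = 4, 3

def pour(src, dst, dst_cap):
    """Pour from one jug into another until the source is empty or the target is full."""
    moved = min(src, dst_cap - dst)
    return src - moved, dst + moved

# Each operator takes a state (a, b) and returns the next state (a, b).
OPERATORS = [
    ("fill A", lambda s: (CAP_A, s[1])),
    ("fill B", lambda s: (s[0], CAP_B)),
    ("empty A", lambda s: (0, s[1])),
    ("empty B", lambda s: (s[0], 0)),
    ("pour A->B", lambda s: pour(s[0], s[1], CAP_B)),
    ("pour B->A", lambda s: tuple(reversed(pour(s[1], s[0], CAP_A)))),
]

# Goal state: exactly 2 units in jug A.
solution = search_problem_space((0, 0), lambda s: s[0] == 2, OPERATORS)
for step_name, state in solution:
    print(step_name, state)
```

In a diagnostic setting the states would be sets of hypotheses and findings rather than jug contents, and the operators would be inferences or requests for data; the search skeleton would stay the same under these assumptions.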

In the late 1970s and early 1980s, advances
in understanding the nature of human expertise
were paralleled by developments in medi- 
cal AI — particularly expert systems tech- 
nology. Artificial intelligence in medicine 
and medical cognition mutually influenced 
each other in a number of ways, including 
(1) providing a basis for developing formal 
models of competence in problem-solving 
tasks, (2) elucidating the structure of medi- 
cal knowledge and providing important epis- 
temological distinctions, and (3) character- 
izing productive and less-productive lines 
of reasoning in diagnostic and therapeutic 
tasks. Gorry (1973) conducted a series of 
studies comparing a computational model 
of medical problem solving with the actual 
problem-solving behavior of physicians. This 
analysis provided a basis for characterizing 
a sequential process of medical decision- 
making — one that differs in important re- 
spects from early diagnostic computational 
[...] and colleagues (1976) capitalized on some of
the insights of Gorry’s earlier work and de- 
veloped the Present Illness Program, a pro- 
gram designed to take the history of a patient 
with edema. Several of the questions guiding 
this research, including the nature and orga- 
nization of expert knowledge, were of cen- 
tral concern to both developers of medical 
expert systems and researchers in medical 
cognition. The development and refinement 
of the program was partially based on studies 
of clinical problem solving. 

Medical expert consultation systems such 
as Internist (Miller, Pople, & Myers, 1984) 
and MYCIN (Shortliffe, 1976) introduced
ideas about knowledge-based reasoning
strategies across a range of cognitive
tasks. MYCIN, in particular, had a sub- 
stantial influence on cognitive science. It 
contributed several advances (e.g., repre- 
senting reasoning under uncertainty) in the 
use of production systems as a represen- 
tation scheme in a complex knowledge- 
based domain. MYCIN also highlighted the 
difference between medical problem solv- 
ing and the cognitive dimensions of med- 
ical explanation. Clancey’s work (Clancey 
& Letsinger, 1984, 1985) in GUIDON and
NEOMYCIN was particularly influential in 
the evolution of models of medical cog- 
nition. Clancey endeavored to reconfigure 
MYCIN to employ the system to teach med- 
ical students about meningitis and related 
disorders. NEOMYCIN was based on a more 
psychologically plausible model of medical 
diagnosis. This model differentiated data- 
directed and hypothesis-directed reasoning 
and separated control knowledge from the 
facts upon which it operates. 
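
As a rough illustration of what representing reasoning under uncertainty can look like in a production system, the following sketch implements the certainty-factor combination rule commonly associated with MYCIN-style systems. The function name and the example values are assumptions made for illustration; they are not drawn from the studies cited in this chapter.

```python
def combine_cf(a: float, b: float) -> float:
    """Combine two certainty factors in [-1, 1] that bear on the same hypothesis."""
    if a >= 0 and b >= 0:
        # Two pieces of supporting evidence reinforce each other.
        return a + b * (1 - a)
    if a < 0 and b < 0:
        # Two pieces of disconfirming evidence reinforce each other.
        return a + b * (1 + a)
    denominator = 1 - min(abs(a), abs(b))
    if denominator == 0:
        # Total confirmation meeting total disconfirmation is undefined.
        raise ValueError("cannot combine certainty factors of +1 and -1")
    # Conflicting evidence partially cancels out.
    return (a + b) / denominator


if __name__ == "__main__":
    print(combine_cf(0.6, 0.4))   # supporting evidence accumulates
    print(combine_cf(0.8, -0.3))  # conflicting evidence partially cancels
```

The point of the sketch is only that each rule contributes a graded degree of belief rather than a true-or-false conclusion, so that evidence from several rules can accumulate or cancel.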

Feltovich and colleagues (Feltovich et al., 
1984), drawing on models of knowledge 
representation from medical AI, character- 
ized fine-grained differences in knowledge 
organization between subjects with differ- 
ent levels of expertise in the domain of 
pediatric cardiology. These differences ac- 
counted for subjects’ inferences about di- 
agnostic cues and evaluation of competing 
hypotheses. Patel and Groen (1986), incorporating
distinctions introduced by Clancey, studied the
knowledge-based solution strategies of expert
cardiologists as evidenced by
their pathophysiological explanations of a 
complex clinical problem. The results indi- 
cated that subjects who accurately diagnosed 
the problem employed a forward-oriented 
reasoning strategy — using patient data to 
lead toward a complete diagnosis (i.e., rea- 
soning from data to hypothesis). In contrast, 
subjects who misdiagnosed or partially diag- 
nosed the patient problem used a backward 
reasoning strategy. These research findings 
presented a challenge to the hypothetico- 
deductive model of reasoning as espoused by 
Elstein, Shulman, & Sprafka (1978), which 
did not differentiate expert from nonexpert 
reasoning strategies. 

Much of the early research in the study 
of reasoning in domains such as medicine 
was carried out in laboratory or experimen- 
tal settings. In more recent times, a shift oc- 
curred toward examining cognitive issues in 
naturalistic medical settings, such as med- 
ical teams in intensive care units (Patel, 
Kaufman, & Magder, 1996), anesthesiolo- 
gists working in surgery (Gaba, 1992), nurses 
providing emergency telephone triage (Lep- 
rohon & Patel, 1995), and reasoning with 
technology by patients in the health care 
system (Patel et al., 2002). This research 
was informed by work in the area of dy- 
namic decision making (Salas & Klein, 2001), 
complex problem solving (Frensch & Funke, 
1995), human factors (Hoffman & Deffen- 
bacher, 1993; Vicente & Rasmussen, 1990), 
and cognitive engineering (Rasmussen, 
Pejtersen, & Goodstein, 1994). Such studies, 
conducted in the workplace, reshaped our 
views of human thinking by shifting the onus 
of cognition from being the unique province 
of the individual to being distributed across 
social and technological contexts. 


Models of Medical Reasoning 


The traditional view of medical reasoning 
has been to treat diagnosis as similar to 
the scientist’s task of making a discovery or 
engaging in scientific experimentation (see 
with this view of science is the assump-
tion that diagnostic inference follows a 
hypothetico-deductive process of reaching 
conclusions by testing hypotheses based on
clinical evidence. From a cognitive perspec- 
tive, as we saw previously, this view of the 
diagnostic process in medicine was first pro- 
posed in the influential work of Elstein, 
Shulman, and Sprafka (1978). The view of 
medical reasoning as hypothetico-deductive 
has been challenged on various points, by both
empirical research and philosophical discourse,
as we will see in the following section. 


Toward a Model of Reasoning in 
Medicine: Induction, Deduction, 
and Abduction 


It generally is agreed that there are two basic 
forms of reasoning. One is deductive reason- 
ing (see Evans, Chap. 8), which consists of 
deriving a particular valid conclusion from a 
set of general premises, and the other is in- 
ductive reasoning (see Sloman & Lagnado, 
Chap. 5), which consists of deriving a likely 
general conclusion from a set of particular 
statements. However, reasoning in the “real 
world” does not appear to fit neatly into 
these basic types. For this reason, a third 
form of reasoning has been recognized in 
which deduction and induction are com- 
bined. This was termed abductive reasoning 
by Peirce (1955). 

Basically, all theories of medical reasoning 
characterize diagnosis as an abductive, cycli- 
cal process of generating possible explana- 
tions (i.e., identification of a set of hypothe- 
ses that are able to account for the clinical 
case on the basis of the available data) and 
testing those explanations (i.e., evaluation of 
each generated hypothesis on the basis of its 
expected consequences) for the abnormal 
state of the patient at hand (Elstein, Shul- 
man, & Sprafka, 1978; Kassirer, 1989; Joseph 
& Patel, 1990; Ramoni et al., 1992). Tra- 
ditional accounts of medical reasoning de- 
scribed the diagnostic process in a way that 
is independent of the underlying structure 


simply make the assumption that some do- 
main of knowledge exists and that all of the 
hypotheses needed to explain a problem are 
available when the diagnostic process begins. 

Within this generic framework, various 
models of diagnostic reasoning may be 
constructed. Following Patel and Ramoni 
(1997), we could distinguish between two 
major models of diagnostic reasoning: heuris- 
tic classification (Clancey, 1985) and cover 
and differentiate (Eshelman, 1988). How- 
ever, these models can be seen as special 
cases of a more general model: the select 
and test model, in which the processes of 
hypothesis generation and testing can be 
characterized in terms of four types of in- 
ferences (Peirce, 1955) — abstraction, abduc- 
tion, deduction, and induction. The first two 
inference types drive hypothesis generation 
whereas the latter two types drive hypothe- 
sis testing. During abstraction, data are fil- 
tered according to their relevance for the 
problem solution and chunked in schemas 
representing an abstract description of the 
problem at hand (e.g., abstracting that an 
adult male with hemoglobin concentration 
less than 14 g/dl is an anemic patient). Following
this, hypotheses that could account
for the current situation are related through 
a process of abduction characterized by a 
“backward flow” of inferences across a chain 
of directed relations that identify initial con- 
ditions from which the current abstract rep- 
resentation of the problem originates. This 
provides tentative solutions to the problem 
at hand by way of hypotheses. For example, 
knowing that disease A will cause symptom 
b, abduction will try to identify the explana- 
tion for b, and deduction will forecast that 
a patient affected by disease A will manifest 
symptom b: Both inferences are using the 
same relation along two different directions. 
These three types of reasoning in medicine 
are described in a paper by Patel and 
Ramoni (1997). 
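
The hypothesis-generation half of this cycle can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: the causal table CAUSES, its disease and finding names, and the function names are all hypothetical, and only the hemoglobin threshold comes from the example in the text. Abduction and deduction traverse the same table in opposite directions.

```python
# Hypothetical causal knowledge: disease -> findings it is expected to cause.
CAUSES = {
    "iron deficiency": {"anemia", "fatigue"},
    "chronic kidney disease": {"anemia", "edema"},
    "heart failure": {"edema", "dyspnea"},
}


def abstract(patient: dict) -> set:
    """Abstraction: turn raw patient data into abstract findings."""
    findings = set(patient.get("reported", []))
    hemoglobin = patient.get("hemoglobin_g_dl")
    # Threshold from the example in the text (adult male, g/dl).
    if patient.get("sex") == "male" and hemoglobin is not None and hemoglobin < 14:
        findings.add("anemia")
    return findings


def abduce(findings: set) -> set:
    """Abduction: follow cause -> effect relations backward, from findings to causes."""
    return {disease for disease, effects in CAUSES.items() if effects & findings}


def deduce(disease: str) -> set:
    """Deduction: follow the same relations forward, from a cause to expected findings."""
    return set(CAUSES[disease])


if __name__ == "__main__":
    patient = {"sex": "male", "hemoglobin_g_dl": 11.5, "reported": ["fatigue"]}
    findings = abstract(patient)
    print(sorted(findings))
    print(sorted(abduce(findings)))
    print(sorted(deduce("iron deficiency")))
```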

In the testing phase, hypotheses are tested 
incrementally according to their ability to 
account for the whole problem, and de- 
duction serves to build up the possible 
world described by the consequences of each 
tomarily regarded as a common way to eval-
uate diagnostic hypotheses (Kassirer, 1989; 
Patel, Evans, & Kaufman, 1989; Joseph & Pa- 
tel, 1990; Patel, Arocha, & Kaufman, 1994, 
2001). As predictions are derived from hy- 
potheses, they are matched to the case 
through a process of induction in which a 
prediction generated from a hypothesis can 
be matched with one specific aspect of the 
patient problem. The major feature of in- 
duction, therefore, is the ability to rule out 
hypotheses, the expected consequences of 
which turn out to be not in agreement with 
the patient’s problem. This is because there 
is no way to logically confirm a hypothe- 
sis, but we can only disconfirm or refute it 
in the presence of contrary evidence. This 
evaluation process closes the testing phase 
of the diagnostic cycle. Moreover, it deter- 
mines which information is needed to dis- 
criminate among hypotheses and, therefore, 
which information has to be collected. 
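
The testing phase can be sketched in the same spirit. In the code below, the table EXPECTED and the function names are hypothetical and serve only to illustrate the logic described above: a hypothesis is dropped when one of its predictions is contradicted by an observation, the survivors are merely not yet refuted, and the findings on which the survivors disagree indicate which information should be collected next.

```python
# Hypothetical predictions: hypothesis -> expected presence of each finding.
EXPECTED = {
    "iron deficiency": {"anemia": True, "edema": False},
    "chronic kidney disease": {"anemia": True, "edema": True},
    "heart failure": {"anemia": False, "edema": True},
}


def rule_out(hypotheses, observed):
    """Keep only hypotheses with no prediction contradicted by an observation."""
    surviving = []
    for hypothesis in hypotheses:
        contradicted = any(
            finding in observed and observed[finding] != expected
            for finding, expected in EXPECTED[hypothesis].items()
        )
        if not contradicted:
            surviving.append(hypothesis)
    return surviving


def discriminating_findings(hypotheses, observed):
    """Return unobserved findings on which the given hypotheses disagree."""
    candidates = sorted({f for h in hypotheses for f in EXPECTED[h]})
    result = []
    for finding in candidates:
        if finding in observed:
            continue
        predictions = {EXPECTED[h].get(finding) for h in hypotheses}
        if len(predictions) > 1:
            result.append(finding)
    return result


if __name__ == "__main__":
    observed = {"anemia": True}
    survivors = rule_out(list(EXPECTED), observed)
    print(survivors)
    print(discriminating_findings(survivors, observed))
```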


Hypothesis Testing and 
Clinical Reasoning 


Although a model such as the one just pre- 
sented can be used to account for a large 
part of the medical diagnostic process, em- 
pirical literature points to various strategies 
of diagnostic reasoning that underscore the 
relative importance of deduction, induction, 
or abduction. In their seminal work, Elstein 
and colleagues (1978) studied the problem- 
solving processes of physicians by drawing 
on then-contemporary methods and theories 
of cognition. Their view of problem solving 
had a substantial influence on both stud- 
ies of medical reasoning and medical edu- 
cation. They were the first to use experi- 
mental methods and theories of cognitive 
science to investigate clinical competency. 
Their research findings led to the develop- 
ment of an elaborated model of hypothetico- 
deductive reasoning, which proposed that 
physicians reasoned by first generating and 
then testing a set of hypotheses to account 
for clinical data (i.e., reasoning from hy- 


a small set of hypotheses very early in the 
case, as soon as the first pieces of data became
available. Second, they were selective in the 
data they collected, focusing only on the rel- 
evant data. Third, physicians made use of the 
hypothetico-deductive process, which con- 
sists of four stages — cue acquisition, hy- 
pothesis generation, cue interpretation, and 
hypothesis evaluation. Cues in the clinical 
case led to the generation of a few selected 
hypotheses, and each cue was interpreted 
as positive, negative, or noncontributory to 
each hypothesis generated. Then each hy- 
pothesis was evaluated for consistency with 
the cues. Using this framework, these in- 
vestigators were unable to find differences 
between superior physicians (as judged by 
their peers) and other physicians (Elstein, 
Shulman, & Sprafka, 1978). 
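
The cue interpretation and hypothesis evaluation stages lend themselves to a small sketch. Everything specific in it is an assumption made for illustration: the hypotheses, the cues, the coding of positive, negative, and noncontributory cues as +1, -1, and 0, and the use of a simple sum as the measure of consistency. The original investigators did not necessarily score hypotheses this way.

```python
# Hypothetical cue interpretations: +1 positive, -1 negative, 0 noncontributory.
INTERPRETATION = {
    "pneumonia": {"fever": 1, "productive cough": 1, "leg swelling": 0},
    "pulmonary embolism": {"fever": 0, "productive cough": -1, "leg swelling": 1},
}


def evaluate(hypothesis, cues):
    """Hypothesis evaluation: sum the interpretations of the acquired cues."""
    table = INTERPRETATION[hypothesis]
    # Cues with no entry for this hypothesis count as noncontributory.
    return sum(table.get(cue, 0) for cue in cues)


def rank(cues):
    """Order the hypotheses from most to least consistent with the cues."""
    return sorted(INTERPRETATION, key=lambda h: evaluate(h, cues), reverse=True)


if __name__ == "__main__":
    cues = ["fever", "productive cough"]
    for hypothesis in rank(cues):
        print(hypothesis, evaluate(hypothesis, cues))
```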


Forward-Driven and Backward-Driven 
Reasoning 


Later, Patel and Groen (1986) studied 
knowledge-based solution strategies of ex- 
pert cardiologists as evidenced by their 
pathophysiological explanations of a com- 
plex clinical problem. The results indi- 
cated that subjects who accurately diagnosed 
the problem employed a forward-oriented 
(data-driven) reasoning strategy — using pa- 
tient data to lead toward a complete diagno- 
sis (i.e., reasoning from data to hypothesis). 
This was in contrast to subjects who mis- 
diagnosed or partially diagnosed the patient 
problem, who tended to use a backward 
or hypothesis-driven reasoning strategy. The 
results of this study presented a challenge 
to the hypothetico-deductive model of rea- 
soning as espoused by Elstein and colleagues 
(1978), which did not differentiate expert 
from nonexpert reasoning strategies. 
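
The contrast between the two directions of reasoning can be made concrete with a toy rule base. The sketch below is illustrative only: the rules and finding names are hypothetical, and the rule set is assumed to be free of cycles. Forward chaining starts from the patient data and derives whatever follows from it; backward chaining starts from a candidate diagnosis and searches for data that would support it.

```python
# Hypothetical rules: (set of premises, conclusion). Assumed to contain no cycles.
RULES = [
    ({"chest pain", "st elevation"}, "myocardial ischemia"),
    ({"myocardial ischemia", "elevated troponin"}, "myocardial infarction"),
]


def forward_chain(data):
    """Data-driven: fire rules from the findings until nothing new follows."""
    known = set(data)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    # Return only what was inferred, not the original data.
    return known - set(data)


def backward_chain(goal, data):
    """Hypothesis-driven: start from a goal and look for supporting data."""
    if goal in data:
        return True
    return any(
        all(backward_chain(premise, data) for premise in premises)
        for premises, conclusion in RULES
        if conclusion == goal
    )


if __name__ == "__main__":
    data = {"chest pain", "st elevation", "elevated troponin"}
    print(sorted(forward_chain(data)))
    print(backward_chain("myocardial infarction", data))
```

Note that the backward procedure must hold a goal and its pending subgoals in mind while it searches, whereas the forward procedure simply applies whatever rules the data trigger.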

A hypothesis for reconciling these seem- 
ingly contradictory results is that forward 
reasoning is used in clinical problems in 
which the physician has ample experi- 
ence. When reasoning through unfamil- 
iar or difficult cases, however, physicians 
knowledge base does not support a pattern-
matching process. To support this expla- 
nation, Patel, Groen, and Arocha (1990) 
looked for the conditions under which 
forward reasoning breaks down. Cardiol- 
ogists and endocrinologists were asked to 
solve diagnostic problems in both fields. 
They showed that under conditions of case 
complexity and uncertainty, the pattern 
of forward reasoning was disrupted. More 
specifically, the breakdown occurred when 
nonsalient cues in the case were tested 
for consistency against the main hypoth- 
esis, even in subjects who had generated 
the correct diagnosis. Otherwise, the re- 
sults supported previous studies in that sub- 
jects with accurate diagnoses used pure 
forward reasoning. 

If forward reasoning breaks down when 
case complexity is introduced, then experts 
and novices should reason differently be- 
cause routine cases for experts would not be 
so for less-than-expert subjects. Investigat- 
ing clinical reasoning in a range of contexts 
of varying complexity (Patel & Groen, 1991; 
Patel, Arocha, & Kaufman, 1994), the authors
found that novices and experts have differ- 
ent patterns of data-driven and hypothesis- 
driven reasoning. As before, experts used 
data-driven reasoning, which depends on 
the physician’s possessing a highly organized 
knowledge base about the patient’s disease, 
including sets of signs and symptoms. Fur- 
thermore, because of their extensive knowl- 
edge base and the high-level inferences they 
make, experts typically skip steps in their 
reasoning. In contrast, because of their lack 
of substantive knowledge or their inability to 
distinguish relevant from irrelevant knowl- 
edge, less-than-expert subjects (novices and 
intermediates) used more hypothesis-driven 
reasoning, resulting often in very complex 
reasoning patterns. Similar patterns of rea- 
soning have been found in other domains 
(Larkin et al., 1980). 

The fact that experts and novices rea- 
son differently suggests that they might 
reach different conclusions (e.g., decisions 
or understandings) when solving medical 
problems. Although data-driven reasoning 


in the absence of adequate domain knowl- 
edge because there are no built-in checks 
on the legitimacy of the inferences that a 
person makes. Pure data-driven reasoning is 
successful only in constrained situations in 
which one’s knowledge of a problem can re- 
sult in a complete chain of inferences from 
the initial problem statement to the prob- 
lem solution. In contrast, hypothesis-driven 
reasoning is slower and imposes a high memory
load, because one has to keep track of
goals and hypotheses. It therefore is most 
likely to be used when domain knowledge 
is inadequate or the problem is complex. 
Hypothesis-driven reasoning is an exemplar 
of a weak method of problem solving in the 
sense that it is used in the absence of relevant 
prior knowledge and when there is uncer- 
tainty about problem solution. In problem- 
solving terms, strong methods engage knowl- 
edge, whereas weak methods refer to general 
strategies. Weak does not necessarily imply 
ineffectual in this context. 

Studies also showed that data-driven rea- 
soning can break down because of uncer- 
tainty (Patel, Groen, & Arocha, 1990). These 
conditions include the presence of “loose 
ends” in explanations in which some piece 
of information remains unaccounted for and 
isolated from the overall explanation. Loose 
ends trigger explanatory processes that work 
by hypothesizing a disease, for instance, and 
trying to fit the loose ends within it in 
a hypothesis-driven reasoning fashion. The 
presence of loose ends may foster learning 
as the person searches for an explanation for 
them. A medical student or a physician may 
encounter a sign or a symptom in a patient 
problem, for instance, and look for infor- 
mation that may account for the finding by 
searching for similar cases seen in the past, 
reading a specialized medical book, or con- 
sulting a domain expert. (See Chi & Ohls- 
son, Chap. 16, for a discussion of such com- 
plex forms of learning.) 

In some circumstances, however, the use 
of data-driven reasoning may lead to a 
heavy cognitive load. When students are 
given problems to solve while they are be- 
ing trained in the use of problem-solving 
duces a heavy load on cognitive resources,
which may diminish students’ ability to fo- 
cus on the task. The reason is that stu- 
dents have to share cognitive resources (e.g., 
attention, memory) between learning the 
problem-solving method and learning the 
content of the material. Research (Sweller, 
1988) suggests that when subjects use a strat- 
egy based on data-driven reasoning, they 
are more able to acquire a schema for the 
problem. In addition, other characteristics 
associated with expert performance were 
observed, such as a reduced number of 
moves to the solution. When subjects used 
a hypothesis-driven reasoning strategy, their 
problem-solving performance suffered. The 
study of medical reasoning has been summarized
in a series of articles (e.g., Patel et al.,
1994; Patel, Kaufman, & Arocha, 2002) and 
papers in edited volumes (Clancey & Short- 
liffe, 1984; Szolovits, 1982). 


The Role of Similarity in Diagnostic 
Reasoning 


The fact that physicians make use of forward 
reasoning in routine cases suggests a type of 
processing that is fast enough to be able to 
lead to the recognition of a set of signs and 
symptoms in a patient and generate a diag- 
nosis based on such recognition. Most often 
this has been interpreted as a type of specific- 
to-general reasoning (e.g., reasoning from an 
individual case to a clinical schema or proto- 
type). However, consistent with the model 
of abductive reasoning, some philosophers 
(Schaffner, 1986) and empirical researchers 
(Norman & Brooks, 1997) have supported 
an alternative hypothesis that consists of 
specific-to-specific reasoning. That is, ex- 
perts also use knowledge of specific instances 
(e.g., particular patients with specific disease 
presentations) to interpret particular cases, 
rather than relying only on general clinical 
knowledge (Kassirer & Kopelman, 1990). 
Brooks and colleagues (Brooks, Norman, 
& Allen, 1991; Norman & Brooks, 1997) ar- 
gued that clinicians make use of specific in- 


clinical case. In such studies, mainly involv- 
ing visual diagnosis — based on data sources 
such as radiographs, dermatological slides, 
and electrocardiograms — specific similarity 
to previous cases accounts for about 30% 
of diagnoses made (see Goldstone & Son, 
Chap. 2; Medin & Rips, Chap. 3). Furthermore,
errors made by experts in identifying
abnormalities in images are affected by the 
prior history of the patient. That is, if the 
prior history of the patient mentioned a pos- 
sible abnormality, expert physicians more of- 
ten identified abnormalities in the images 
even when none existed, which also sup- 
ports the effect of specific past cases on the 
interpretation of a current case. 

In pursuing their explanation, Norman 
and colleagues (Norman & Brooks, 1997)
argued against the hypothesis that expert 
physicians diagnose clinical cases by “ana- 
lyzing” signs and symptoms and developing 
correspondences among those signs, symp- 
toms, and diagnoses, as traditional cognitive 
research in medical reasoning suggests. They 
suggest instead a “nonanalytic” basis for
medical diagnosis, in which diagnostic rea- 
soning is characterized by the unanalyzed 
retrieval of a similar case previously seen 
in medical practice to interpret the current 
case — a kind of exemplar-based or case- 
based reasoning. This discussion has its 
counterpart in the psychology of categoriza- 
tion, in which two accounts have been pro- 
posed — categorization works either by a re- 
liance on prototypes or by exemplars (Medin 
& Rips, Chap. 3). 
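
The difference between the two accounts can be shown with a toy classifier. In the sketch below, the stored cases, their two numeric features, and the diagnostic labels "A" and "B" are hypothetical. A prototype account compares a new case with the average of each category, whereas an exemplar account compares it with every stored case, so a single atypical case seen in the past can determine the diagnosis.

```python
from math import dist

# Hypothetical previously seen cases: (feature vector, diagnosis).
CASES = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((5.0, 5.0), "A"),
    ((3.0, 3.0), "B"), ((3.2, 2.8), "B"),
]


def by_prototype(x):
    """Assign the diagnosis whose average case (prototype) is closest to x."""
    groups = {}
    for features, label in CASES:
        groups.setdefault(label, []).append(features)
    prototypes = {
        label: tuple(sum(column) / len(column) for column in zip(*members))
        for label, members in groups.items()
    }
    return min(prototypes, key=lambda label: dist(x, prototypes[label]))


def by_exemplar(x):
    """Assign the diagnosis of the single most similar stored case."""
    _, label = min(CASES, key=lambda case: dist(x, case[0]))
    return label


if __name__ == "__main__":
    new_case = (4.8, 4.9)
    print(by_prototype(new_case))
    print(by_exemplar(new_case))
```

For the new case in the usage example, the two procedures disagree: the atypical "A" case at (5.0, 5.0) is the nearest exemplar, but the "B" prototype lies closer than the "A" prototype.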

Exemplar-based thinking is certainly a 
fundamental aspect of human cognition. 
There is ample evidence of the conditions 
under which reasoning by analogy to previ- 
ous cases is used (Gentner & Holyoak, 1997; 
Holyoak & Thagard, 1997; see Holyoak, 
Chap. 6). Furthermore, given the complex- 
ity of natural reasoning in a highly dense 
knowledge domain such as medicine, it is 
highly likely that more than one type of rea- 
soning is actually employed. Seen in this 
light, the search for a single manner in 
which clinicians diagnose clinical problems 
may not be a reasonable goal. The inherent 
of knowledge domains, situations, problems,
and cases may call for the use of a variety of 
reasoning strategies, which is what, after all, 
the notion of abductive medical reasoning 
tries to formalize (Patel & Ramoni, 1997). 
Alongside rule-based and prototype reason- 
ing, a model of clinical reasoning may allow 
for case-based, nonanalytical reasoning, in 
which recognizing similarity between partic- 
ulars may be the main cognitive mechanism. 
A reason for the variety of strategies used 
in actual diagnostic problems may be found 
in the inherent organization of medical
knowledge.


Reasoning and the Nature of 
Medical Knowledge 


Although a motivation for looking at med- 
ical reasoning was to establish its relation- 
ship with reasoning in other fields, such as 
science, the prevalent view in the philoso- 
phy of medicine (Blois, 1990) has been that 
medical knowledge has an extremely com- 
plex organization, requiring the use of dif- 
ferent reasoning strategies than those used in 
other more formal scientific disciplines, such 
as physics. Disciplines such as physics, chem- 
istry, and some subfields of biology, are said 
to be horizontally organized, which means 
these domains are characterized by the con- 
struction of causal relations among concepts 
and by the application of general principles 
to specific instances (Blois, 1990). By this, 
it is meant that such scientific fields are
organized in a hypothetico-deductive manner
in which particular statements are gener- 
ated from general statements and causality 
plays a major role. This type of reasoning, in 
which one connects one concept to another 
by forming causal networks, has been called 
horizontal reasoning (Blois, 1990). These 
philosophers argued that causal reasoning 
does not play such an important role in the 
medical domain. They argue, instead, that 
reasoning in medicine requires vertical think- 
ing. This kind of reasoning makes more use 
of analogy than the reasoning typically


view, the medical disciplines, notably clin- 
ical medicine, are organized vertically, and 
reasoning by analogy (see Holyoak, Chap. 6)
plays a more important role than
causal reasoning. Based on such a distinction, 
it has been argued that reasoning in the phys- 
ical sciences and reasoning in the biomedical 
sciences are of different types. 

In particular, it has been argued that rea- 
soning in the physical sciences, to some ex- 
tent, can be conceptualized as a “deductive 
systematization of a broad class of general- 
izations under a small number of axioms,” 
but this characterization cannot be applied 
to the biomedical sciences. The latter are 
characterized by what Schaffner (1986, p. 68)
calls “a series of overlapping interleaved tem- 
poral models” that are based on familiarity 
with shared exemplars to a much greater 
degree than is necessary in the physical 
sciences. Schaffner’s characterization, unlike
that of Blois, applies to both biomedical re- 
search and clinical medicine. In biomedical 
research, an organism such as Drosophila,
for instance, is used as an exemplar embody- 
ing a given disease mechanism that, by anal- 
ogy, applies to other organisms, including 
humans. In the clinical sciences, the patient 
is seen as an exemplar to which generaliza- 
tions based on multiple overlapping models 
are applied from diseases and the population 
of similar patients. 

In the empirical research on medical 
reasoning, the distinction between reason- 
ing from cases versus reasoning from pro- 
totypes has not been established. Medical 
knowledge consists of two categories of 
knowledge: clinical knowledge, including
knowledge of disease processes and associated
findings; and basic science knowledge,
incorporating subject matter such as
biochemistry, anatomy, and physiology. Ba- 
sic science or biomedical knowledge is sup- 
posed to provide a scientific foundation for 
clinical reasoning. The conventional view 
is that basic science knowledge can be 
seamlessly integrated into clinical knowl- 
edge analogous to the way that learning the 
rules of the road can contribute to one’s 
mastery of driving a car. In this capacity, 
Figure 30.1. Idealized representation of the “intermediate effect.” The straight line gives a commonly
assumed representation of performance development by level of expertise. The curved, U-shaped, 
line represents the actual development from novice to expert. The y-axis may represent performance 
variables, such as the number of errors made, irrelevant concepts recalled, conceptual elaborations, or 
number of extraneous hypotheses generated in a variety of tasks. 


a particular piece of biomedical knowledge 
could be automatically elicited in a range of 
clinical contexts and tasks in more or less the 
same fashion. 


Knowledge Organization and 
Changes in Directionality 


Following Blois (1988) and Schaffner
(1986), it can be argued that the way medical 
knowledge is organized can be a determinant 
factor explaining why experts do not use the 
hypothetico-deductive method of reasoning. 
Maybe the medical domain is too messy 
to allow its neat partitioning and deductive 
use of reasoning strategies. Although the the- 
ory of reasoning in medicine is basically a 
theory of expert knowledge, reaching the 
level of efficient reasoning of the expert clin- 
ician reflects the extended continuum of 
training and levels of reasoning performance 


(Thibodeau et al., 1989; Chi et al., 1989). 
This continuum also points to the partic- 
ular nature of medical knowledge and its 
acquisition. 

Changes have been described in this pro- 
cess that serve to characterize the various 
phases medical trainees go through to be- 
come expert clinicians. An important char- 
acteristic of this process is the intermediate 
effect. This refers to the fact that, although 
it seems reasonable to assume that perfor- 
mance improves with training or time on 
task, there appear to be particular transitions 
in which subjects exhibit a certain drop in 
performance. This is an example of what is 
referred to as nonmonotonicity in the devel- 
opmental literature (Strauss & Stavy, 1982) 
and is also observed in skill acquisition. The 
symptom is a learning curve or developmen- 
tal pattern that is shaped like a U or an 
inverted U, as illustrated in Figure 30.1. In 
medical expertise development, intermedi- 
ates’ performance reflects the degradation in 
of knowledge through a time during which
such knowledge is not well-organized and 
irrelevant associations abound in the
intermediate’s knowledge base. In contrast, the
novice’s knowledge base is too sparse, con- 
taining very few associations, whereas the 
expert’s knowledge base is well pruned of 
the irrelevancies that characterize interme- 
diates. It should be noted that not all in- 
termediate performance is nonmonotonic; 
for example, on some global criteria such 
as diagnostic accuracy, there appears to be 
a steady improvement. 

The intermediate effect occurs with many 
tasks and at various levels of expertise. The 
tasks vary from comprehension of clinical 
cases and explanation of clinical problems 
to problem solving to generating laboratory 
data. The phenomenon may be attributable 
to the fact that intermediates have acquired 
an extensive body of knowledge but have 
not yet reorganized this knowledge in a 
functional manner. Intermediate knowledge 
therefore has a sort of network structure 
that results in considerable search, which 
makes it more difficult for intermediates to 
set up structures for rapid encoding and 
selective retrieval of information (Patel & 
Groen, 1991). In contrast, expert knowledge 
is finely tuned to perform various tasks, and 
experts can readily filter out irrelevant infor- 
mation using their hierarchically organized 
schemata. The difference is reflected in the 
structural organization of knowledge and the 
extent to which it is proceduralized to
perform different tasks.

Schmidt and Boshuizen (1993) reported 
that intermediate nonmonotonicity recall 
effects disappear by using short exposure 
times (about thirty seconds), which sug- 
gests that under time-restricted conditions, 
intermediates cannot engage in extraneous 
search. Whereas a novice’s knowledge base 
is likely to be sparse and an expert’s knowl- 
edge base is intricately interconnected, the 
knowledge base of an intermediate possesses 
many of the pieces of knowledge but lacks 
the extensive connectedness of an expert. 
Until this knowledge becomes further con- 


engage in unnecessary search. Whether this 
knowledge, painfully acquired during med- 
ical training, is really necessary for clinical 
reasoning has been a focus of intensive re- 
search and great debate. If expert clinicians 
do not explicitly use underlying biomedi- 
cal knowledge, does that mean that it is not 
necessary? Or could it be simply that this 
knowledge remains “dormant” until it is really
needed? This raises an important question of 
whether expert medical knowledge is deep 
or shallow. 


Causal Reasoning in Medicine 


The differential role of basic science knowl- 
edge (e.g., physiology and biochemistry) in 
solving problems of varying complexity and 
the differences between subjects at different 
levels of expertise (Patel et al., 1994) have 
been a source of controversy in the study of 
medical cognition (Patel & Kaufman, 1995) 
as well as in medical education and AI. 
As expertise develops, the disease knowl- 
edge of a clinician becomes more dependent 
on clinical experience, and clinical problem 
solving is increasingly guided by the use of 
exemplars and analogy and becomes less de- 
pendent on a functional understanding of 
the system in question. However, an in- 
depth conceptual understanding of basic sci- 
ence plays a central role in reasoning about 
complex problems and is also important in 
generating explanations and justifications for 
decisions. 

Researchers in AI were confronted with 
similar problems in extending the utility of 
systems beyond their immediate knowledge 
base. Biomedical knowledge can serve differ- 
ent functional roles depending on the goals 
of the system. Most models of diagnostic rea- 
soning in medicine can be characterized as 
being shallow. A shallow medical expert sys- 
tem (e.g., MYCIN and INTERNIST) reasons 
by relating observations to intermediate hy- 
potheses that partition the problem space 
and further by associating intermediate 
is consistent with the way physicians appear
to reason. 

There are other medical reasoning sys- 
tem models that propose a “deep” mode 
of reasoning as a main mechanism. Chandrasekaran,
Smith, and Sticklen (1989) characterize
a deep system as one that embodies
a causal mental model of bodily function
and malfunction, similar to the models used 
in qualitative physics (Bobrow, 1985). Systems
such as MDX-2 (Chandrasekaran et al.,
1989) or QSIM (Kuipers, 1987) have ex- 
plicit representations of structural compo- 
nents and their relations, the functions of 
these components (in essence their purpose),
and their relationship to behavioral
states.

To become licensed physicians, medical 
trainees undergo a lengthy training process 
that entails the learning of biomedical sci- 
ences, including biochemistry, physiology, 
anatomy, and others. The apparent contra- 
diction between this type of training and 
the absence of deep biomedical knowledge 
during expert medical reasoning has been 
pointed out. To account for such apparent 
inconsistency, Boshuizen and Schmidt 
(1992) proposed a learning mechanism — 
knowledge encapsulation. Knowledge encap- 
sulation is a learning process that involves 
the subsumption of biomedical propositions 
and their interrelations in associative clusters 
under a small number of higher-level clini- 
cal propositions with the same explanatory 
power. Through exposure to clinical train- 
ing, biomedical knowledge presumably be- 
comes integrated with clinical knowledge. 
Biomedical knowledge can be “unpacked” 
when needed but is not used as a first line 
of explanation. 

Boshuizen and Schmidt (1992) cite a 
wide range of clinical reasoning and re- 
call studies that support this kind of learn- 
ing process. Of particular importance is the 
well-documented finding that with increas- 
ing levels of expertise, physicians produce 
explanations at higher levels of generality, 
using fewer and fewer biomedical concepts 
while producing consistently accurate re- 


accounted for as a stage in the encapsula- 
tion process in which a trainee’s network of 
knowledge has not yet become sufficiently 
differentiated, resulting in more extensive 
processing of information. 

Knowledge encapsulation provides an ap- 
pealing account of a range of developmen- 
tal phenomena in the course of acquiring 
medical expertise. The integration of ba- 
sic science in clinical knowledge is a rather 
complex process, however, and encapsula- 
tion is likely to be only part of the knowledge 
development process. Basic science knowl- 
edge plays a different role in different clini- 
cal domains. For example, clinical expertise 
in perceptual domains, such as dermatology 
and radiology, necessitates a relatively ro- 
bust model of anatomical structures that is 
the primary source of knowledge for diag- 
nostic classification. In other domains, such 
as cardiology and endocrinology, basic sci- 
ence knowledge has a more distant relation- 
ship with clinical knowledge. The miscon- 
ceptions evident in physicians’ biomedical 
explanations would argue against their hav- 
ing well-developed encapsulated knowledge 
structures in which basic science knowledge 
can easily be retrieved and applied when 
necessary. 

The results of research into medical prob- 
lem solving are consistent with the idea that 
clinical medicine and biomedical sciences 
constitute two distinct and not completely 
compatible worlds with distinct modes of 
reasoning and quite different ways of struc- 
turing knowledge (Patel, Arocha, & Kauf- 
man, 1994). Clinical knowledge is based 
on a complex taxonomy that relates dis- 
ease symptoms to underlying pathology. In 
contrast, biomedical sciences are based on 
general principles defining chains of causal 
mechanisms. Learning to explain how a set 
of symptoms is consistent with a diagnosis 
therefore may be very different from learn- 
ing how to explain what causes a disease. 
(See Buehner & Cheng, Chap. 7, for a dis- 
cussion of causal learning.) 

The notion of the progression of men- 
tal models (White & Frederiksen, 1990) has 
characterizing the development of concep- 
tual understanding in biomedical contexts. 
Mental models are dynamic knowledge 
structures composed to make sense of expe- 
rience and to reason across spatial or tempo- 
ral dimensions (see Johnson-Laird, Chap. 9). 
An individual’s mental models provide pre- 
dictive and explanatory capabilities of the 
function of a given system. The authors 
employed the progression of mental mod- 
els to explain the process of understand- 
ing increasingly sophisticated electrical cir- 
cuits. This notion can be used to account 
for differences between novices and experts 
in understanding circulatory physiology, 
describing misconceptions (Patel, Arocha, & 
Kaufman, 1994), and explaining the 
generation of spontaneous analogies in 
causal reasoning. 

Running a mental model is a potentially 
powerful form of reasoning but it is also 
cognitively demanding. It may require an 
extended chain of reasoning and the use 
of complex representations. It is apparent 
that skilled individuals learn to circumvent 
long chains of reasoning and chunk or com- 
pile knowledge across intermediate states of 
inference (Chandrasekaran, 1994; Newell, 
1990). This results in shorter, more direct, 
inferences that are stored in long-term mem- 
ory and are directly available to be retrieved 
in the appropriate contexts. Chandrasekaran 
(1994) refers to this sort of knowledge as 
compiled causal knowledge. This term refers 
to knowledge of causal expectations that 
people compile directly from experience 
and partly by chunking results from previ- 
ous problem-solving endeavors. The goals 
of the individual and the demands of re- 
curring situations largely determine which 
pieces of knowledge get stored and used. 
When physicians are confronted with a simi- 
lar situation, they can employ this compiled 
knowledge in an efficient and effective man- 
ner. The development of compiled knowl- 
edge is an integral part of the acquisition 
of expertise. 
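
The contrast between running a mental model and retrieving compiled 
knowledge can be sketched in a few lines of Python. This is only an 
illustration under stated assumptions: the causal links in 
`CAUSAL_STEPS` are placeholders, and the names `run_model`, 
`recall_or_run`, and `compiled` are hypothetical, introduced here 
rather than taken from Chandrasekaran (1994). 

```python
# Hypothetical causal chain: each key leads to the next state.
CAUSAL_STEPS = {
    "finding_a": "state_b",
    "state_b": "state_c",
    "state_c": "outcome_d",
}

def run_model(start, steps=CAUSAL_STEPS):
    """Follow the chain link by link; return (conclusion, links traversed)."""
    node, links = start, 0
    while node in steps:
        node = steps[node]
        links += 1
    return node, links

# Store of compiled (chunked) inferences: start state -> conclusion.
compiled = {}

def recall_or_run(start):
    """Retrieve a compiled inference if one exists; otherwise run the
    full chain once and store its result for later retrieval."""
    if start in compiled:
        return compiled[start], 0   # direct retrieval, no chain traversed
    conclusion, links = run_model(start)
    compiled[start] = conclusion    # chunk the result of this episode
    return conclusion, links
```

On the first call, `recall_or_run("finding_a")` traverses three links 
to reach `"outcome_d"`; on a second call with the same input, the 
conclusion is retrieved directly with no links traversed. 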

The idea of compiling declarative knowl- 
edge bears a certain resemblance to the idea 
of knowledge encapsulation, but the claim 


cess of compiling knowledge is not one of 
subsumption or abstraction, and the origi- 
nal knowledge (uncompiled mental model) 
may no longer be available in a similar 
form (Kuipers & Kassirer, 1984). The sec- 
ond difference is that mental models are 
composed dynamically out of constituent 
pieces of knowledge rather than prestored 
unitary structures. The use of mental mod- 
els is somewhat opportunistic and the learn- 
ing process is less predictable. The compi- 
lation process can work in reverse as well. 
That is to say, discrete cause-and-effect re- 
lationships can be integrated into a mental 
model as a student reasons about complex 
physiological processes. 


Errors and Medical Reasoning 


According to the report from the Institute 
of Medicine (Kohn, Corrigan, & Donaldson, 
1999), medical error is the eighth leading 
cause of death in the United States, ahead 
of deaths attributable to motor vehicle acci- 
dents, breast cancer, or acquired immunod- 
eficiency syndrome. Cognitive mechanisms, 
such as mistakes of reasoning and decision 
making and action slips of skilled perfor- 
mance, are the major factors contributing 
to medical errors. A cognitive taxonomy is 
essential to understanding, explaining, and 
predicting medical errors and to develop- 
ing interventions to reduce medical errors. 
Based on the definition and preliminary tax- 
onomy by Reason (1990) and the action the- 
ory by Norman (1986), Zhang et al. (2004, 
in review) developed a cognitive taxonomy 
for human errors in medicine. 


A Cognitive Taxonomy of 
Medical Errors 


One critical step toward understanding the 
cognitive mechanisms of various errors in 
medical reasoning is to categorize the er- 
rors along cognitively meaningful dimen- 
sions. Reason (1990) defines human error as 
in a planned sequence of mental or physical 
activities. He divides human errors into two 
major categories: (1) slips that result from 
the incorrect execution of a correct action 
sequence and (2) mistakes that result from 
the correct execution of an incorrect ac- 
tion sequence. Norman’s theory of action 
(Norman, 1986) decomposes a human ac- 
tivity into seven stages. 

Based on Reason’s definition of human 
error and Norman’s action theory, Zhang 
and colleagues developed a cognitive tax- 
onomy. Under this taxonomy, errors are di- 
vided into slips and mistakes, just like Rea- 
son’s two main categories. Then slips are 
divided into execution slips and evaluation 
slips. Execution slips include goal, intention, 
action specification, and action execution 
slips, whereas evaluation slips include per- 
ception, interpretation, and evaluation slips. 
Similarly, mistakes can be divided into exe- 
cution mistakes that include goal, intention, 
action specification, and action execution 
mistakes and evaluation mistakes that in- 
clude perception, interpretation, and evalu- 
ation mistakes. This taxonomy can cover ma- 
jor types of medical errors, because a medical 
error is a human error in an action and any 
action goes through the seven stages of the 
action cycle. Most reasoning and decision- 
making errors in medicine are under the cat- 
egory of mistakes in the taxonomy. They 
are attributable to incorrect or incomplete 
knowledge. 
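
The structure of the taxonomy described above can be laid out as a 
small data structure. The following Python sketch is illustrative 
only; the names `ErrorKind`, `EXECUTION_STAGES`, `EVALUATION_STAGES`, 
and `taxonomy` are assumptions made for this example and are not part 
of the formulation by Zhang and colleagues. 

```python
from enum import Enum

class ErrorKind(Enum):
    SLIP = "slip"        # incorrect execution of a correct action sequence
    MISTAKE = "mistake"  # correct execution of an incorrect action sequence

# The seven stages of the action cycle, grouped as in the text.
EXECUTION_STAGES = ("goal", "intention",
                    "action specification", "action execution")
EVALUATION_STAGES = ("perception", "interpretation", "evaluation")

def taxonomy():
    """Return every (kind, phase, stage) cell of the taxonomy."""
    cells = []
    for kind in ErrorKind:
        for stage in EXECUTION_STAGES:
            cells.append((kind.value, "execution", stage))
        for stage in EVALUATION_STAGES:
            cells.append((kind.value, "evaluation", stage))
    return cells
```

Crossing the two error kinds with the seven stages gives 14 cells, 
for example `("mistake", "evaluation", "interpretation")`. 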


Reasoning and Decision-Making 
Mistakes in Medicine 


In the cognitive taxonomy, goal and inten- 
tion mistakes are mistakes about declarative 
knowledge — knowledge about factual state- 
ments and propositions, such as “Motrin is 
a pain reliever and fever reducer.” Action 
specification mistakes and action execu- 
tion mistakes are mistakes about procedural 
knowledge — knowledge about procedures 
and rules, such as “give 1 tsp Motrin to a 
child per dosage up to 4 times a day if the 


of the child is 24-35 lbs.” 

Goal mistakes and intention mistakes 
are caused by many complex factors such 
as incorrect knowledge, incomplete knowl- 
edge, and misuse of knowledge; biases; 
faulty heuristics; and information overload. 
For example, neglect of base rate informa- 
tion could result in incorrect diagnosis of 
a disease. This is a well-documented find- 
ing in human decision making (Tversky & 
Kahneman, 1974; Kahneman & Frederick, 
Chap. 12). As another example, the goal of 
“treating the disease as pneumonia” could 
be a mistake if it is a misdiagnosis based on 
incomplete knowledge (e.g., without radio- 
graphic images). Intention mistakes can be 
caused by similar factors, such as the follow- 
ing example: A physician treating a patient 
with oxygen set the flow control knob be- 
tween one and two liters per minute, not 
realizing that the scale numbers represented 
discrete, rather than continuous, settings. As 
a result, the patient did not receive any oxy- 
gen. This is a mistake caused by incomplete 
knowledge. The use of heuristics is another 
common source of goal and intention mis- 
takes. A heuristic often used is the reliance 
on disease schemata during clinical diagno- 
sis. Disease schemata are knowledge struc- 
tures that have been formed from previ- 
ous experience with diagnosing diseases and 
contain information about relevant and ir- 
relevant signs and symptoms. When physi- 
cians and medical students diagnose pa- 
tients, they tend to rely on their schemata 
and base their reasoning on the apparent 
similarity of patient information with these 
schemata instead of a more objective anal- 
ysis of patient data. The schemata used in 
diagnosis often guide future reasoning about 
the patient, affecting what tests are run 
and how data are interpreted. Arocha and 
Patel (1995) found that medical students and 
trainees maintained their initial hypotheses, 
even if subsequent data were contradictory. 
Therefore, if the initial hypothesis is wrong, 
errors in diagnosis and treatment are likely 
to occur. Preliminary presentation of the pa- 
tient (e.g., signs and symptoms), then, be- 
comes very important because it can suggest 
of schemata). 
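
The effect of neglecting base rate information, mentioned earlier in 
this section, can be made concrete with Bayes' rule. The sketch below 
is a worked illustration only: the prevalence, sensitivity, and 
specificity values are hypothetical numbers chosen for the example, 
and the function name `posterior` is an assumption. 

```python
def posterior(prevalence, sensitivity, specificity):
    """P(disease | positive finding) computed with Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical values: a rare disease and a fairly accurate test.
p = posterior(prevalence=0.01, sensitivity=0.90, specificity=0.90)
```

With these assumed values the posterior probability is 1/12, or 
roughly 8%. A reasoner who ignores the 1% base rate and attends only 
to the 90% accuracy of the test would greatly overestimate the 
likelihood of the disease. 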

Action specification and action execution 
mistakes are procedural mistakes that can 
be caused by many factors, such as lack of 
correct rules, overgeneralized application of 
good rules, misapplication of good rules, en- 
coding deficiencies in rules, and dissociation 
between knowledge and rules. Overgener- 
alized application of good rules, for exam- 
ple, can cause an error because the condi- 
tion part of a condition-action rule could be 
misidentified and mismatched, causing the 
firing of the action part of the rule. Proce- 
dural mistakes caused by encoding deficien- 
cies of action rules are usually attributable to 
the evolving nature of the rules and unfore- 
seeable conditions that cannot be encoded 
in the rules. A good rule may be misused 
because the user may have incorrect or in- 
complete knowledge about the condition of 
the rule in a specific context. The knowl- 
edge of a rule and the knowledge of how 
to use a rule are not always automatically 
linked without extensive practice. This dis- 
sociation, attributable to the lack of expe- 
rience and practiced skills, may also lead to 
action execution mistakes. 
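
One way to picture the overgeneralized application of a good rule is 
as a condition-action rule whose condition is only partly checked. 
The Python sketch below is a simplified illustration; the `Rule` 
class, the `fires` function, the `checked` parameter, and the feature 
names are all hypothetical and do not describe any clinical rule. 

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    conditions: frozenset  # features that must all hold for the rule to apply
    action: str

def fires(rule, case_features, checked=None):
    """Return True if the rule fires for this case.

    With checked=None, every condition is tested. Passing a smaller
    set models an overgeneralized match: conditions outside `checked`
    are never tested, so the rule can fire when it should not.
    """
    to_check = rule.conditions if checked is None else rule.conditions & checked
    return to_check <= case_features

# Hypothetical rule and case, for illustration only.
rule = Rule(frozenset({"fever", "weight_in_range"}), "give standard dose")
case = frozenset({"fever"})
```

Here `fires(rule, case)` is false because the case lacks one of the 
two conditions, whereas `fires(rule, case, checked=frozenset({"fever"}))` 
is true: the action part fires because only part of the condition was 
matched. 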

Perception mistakes can be caused by 
expectation-driven processing. What we 
perceive is a function of the input and our 
expectations. This mechanism is what al- 
lows us to read sloppy handwriting or rec- 
ognize degraded images. However, our ex- 
pectations can also lead to misperceptions. 
Interpretation mistakes are the incorrect in- 
terpretation of feedback caused by incorrect 
or incomplete knowledge. Suppose, for in- 
stance, that an intravenous infusion pump, 
a device often used in critical care environ- 
ments to give medications, indicates readi- 
ness to begin infusing medications using a 
steady green light and indicates the infusion 
is in progress by flashing the green light. If 
the device user does not know the mean- 
ing of the steady green light, he or she 
may incorrectly interpret it as an indication 
that the infusion has begun. Generating dif- 
ferent interpretations and treatment proce- 
dures from the same evidence is another 
source of interpretation mistake. An action 
evaluation mistake occurs when incorrect 


judge the completion or incompletion of 
a goal erroneously. 


Medical Reasoning and 
Decision Research 


Decision making is central to medical activ- 
ity. Although health-care professionals are 
generally highly proficient decision mak- 
ers, their erroneous decisions have be- 
come the source of considerable public 
scrutiny (Kohn, Corrigan, & Donaldson, 
1999). 

Decisions involve the application of rea- 
soning to select some course of action 
that achieves the desired goal (see LeBoeuf 
& Shafir, Chap. 11). Hastie (2001) identi- 
fied three components of decision making: 
(1) choice options and courses of actions; 
(2) beliefs about objective states, processes, 
and events in the world, including out- 
come states and means to achieve them; and 
(3) desires, values, or utilities that describe 
the consequences associated with the out- 
comes of each action-event combination. 
Reasoning plays a major role in this pro- 
cess. In this context, a major thrust of re- 
search has been the study of hypothesis test- 
ing, which has been studied widely in the 
medical domain. Such research has shown 
the pervasiveness of confirmation bias, ev- 
idenced by the generation of a hypothesis 
and the subsequent search for evidence con- 
sistent with the hypothesis, often leading to 
failure to adequately consider alternative di- 
agnostic possibilities. This bias may result in 
a less-than-thorough investigation with pos- 
sible adverse consequences for the patient. 
A desire to confirm one’s preferred hypoth- 
esis, moreover, may contribute to increased 
inefficiency and costs by ordering additional 
laboratory tests that will do little to revise 
one’s opinion, providing largely redundant 
data (Chapman & Elstein, 2000). 
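
The three components of decision making identified by Hastie can be 
combined in a simple calculation. The sketch below uses expected 
utility as one conventional way of combining beliefs and utilities; 
that choice, the function names `expected_utility` and `best_option`, 
and all of the numbers are assumptions made for illustration, not 
findings from the studies cited here. 

```python
def expected_utility(beliefs, utilities):
    """Sum of utility weighted by believed probability of each outcome."""
    return sum(p * utilities[outcome] for outcome, p in beliefs.items())

def best_option(options):
    """Pick the option with the highest expected utility.

    options maps each choice option to a (beliefs, utilities) pair.
    """
    return max(options, key=lambda name: expected_utility(*options[name]))

# Hypothetical options, beliefs (probabilities), and utilities.
OPTIONS = {
    "treat": ({"recover": 0.8, "side_effect": 0.2},
              {"recover": 1.0, "side_effect": -0.5}),
    "wait":  ({"recover": 0.5, "worsen": 0.5},
              {"recover": 1.0, "worsen": -1.0}),
}
```

With these assumed values, "treat" has an expected utility of about 
0.7 and "wait" has 0.0, so `best_option(OPTIONS)` selects "treat". 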

Health-care team decision making is the 
rule rather than the exception in medicine. 
Naturalistic decision-making (NDM) is 
concerned with the study of cognition in 
real-world work environments that are often 
1993). The majority of this research com- 
bines conventional protocol analytic meth- 
ods with innovative methods designed to 
investigate reasoning and behavior in re- 
alistic settings (Woods, 1993; Rasmussen, 
Pejtersen, & Goodstein, 1994). The study 
of decision making in the work context 
necessitates an extended cognitive science 
framework beyond typical characterizations 
of knowledge structures, processes, and 
skills to include modulating variables such 
as stress, time pressure, and fatigue, as 
well as communication patterns in team 
performance. 

Among the issues investigated in NDM 
are understanding how decisions are jointly 
negotiated and updated by participants dif- 
fering substantially in their areas of expertise 
(e.g., pharmacology, respiratory medicine), 
how the complex communication process in 
these settings occurs, what role technology 
plays in mediating decisions and how it af- 
fects reasoning, and what sources of error 
exist in the decision-making process. 

Patel, Kaufman, and Magder (1996) stud- 
ied decision-making in a medical intensive 
care unit with the particular objective of de- 
scribing jointly negotiated decisions, com- 
munication processes, and the development 
of expertise. Intensive care decision-making 
is characterized by a rapid serial evaluation 
of options leading to immediate action in 
which reasoning is schema-driven in a 
forward direction toward action with minimal 
inference or justification. When patients do 
not respond in a manner consistent with the 
original hypothesis, however, the original de- 
cision comes under scrutiny. This strategy 
can result in a brainstorming session in which 
the team retrospectively evaluates and re- 
considers the decision and considers possible 
alternatives. In such circumstances, various 
patterns of reasoning are used to evaluate 
alternatives in these brainstorming sessions. 
These include probabilistic reasoning, diag- 
nostic reasoning, and biomedical causal rea- 
soning. Supporting decision making in clini- 
cal settings necessitates an understanding of 
communication patterns. 

In summary, although traditional ap- 
proaches to decision making looked at deci- 


real-world decision-making is best investi- 
gated by a naturalistic approach in which 
reasoning is constrained by dynamic factors, 
such as stress, time pressure, risk, and team 
interactions. Looking at medical reasoning 
in social and collaborative settings is even 
more important when information tech- 
nologies are part of the ebb and flow of clini- 
cal work. 


Reasoning and Medical Education 


The failures and successes of reasoning 
strategies and skills can be traced back to 
their sources — education. There is evidence 
suggesting that the way physicians reason 
follows from the way they were educated. 
Medical education in North America as well 
as in the rest of the world has followed a sim- 
ilar path — from practice-based training to an 
increasingly scientific type of training. 

Motivated by the increasing importance 
of basic scientific knowledge in the context 
of clinical practice, problem-based learning 
(PBL) was developed on the premise that 
not only should physicians possess the or- 
dered and systematic knowledge of science, 
but they should think like scientists during 
their practices. Consistent with this idea, 
an attempt was made to teach hypothetico- 
deductive reasoning to medical students 
to provide an adequate structure to med- 
ical problem solving. After all, this was 
the way scientists were supposed to make 
discoveries. 

Based on cognitive research in other 
knowledge domains, some researchers ar- 
gued, however, that the hypothetico- 
deductive method might not be the most 
efficient way of solving clinical problems. To 
investigate how the kind of training medi- 
cal students received affected their reasoning 
patterns, Patel, Groen, and Norman (1993) 
looked at the problem-solving processes of 
students in two medical schools with dif- 
ferent modes of instruction — classical and 
problem-based. They found that students in 
the problem-based curriculum reasoned in 
a way that was consistent with their train- 
ing methods, showing a preponderance of 
sive elaborations of biomedical information. 
The PBL students have been shown to 
use hypothesis-driven reasoning — from the 
hypothesis to explain the patient data — 
whereas non-PBL students use mainly data- 
driven reasoning — from data toward the hy- 
pothesis. In explaining clinical cases, PBL 
students produce extensive elaborations us- 
ing detailed biomedical information, which 
is relatively absent from non-PBL students’ 
explanations. However, these elaborations 
result in the generation of errors. Problem- 
based learning promotes the activation and 
elaboration of prior knowledge. 

Patel and colleagues (Patel, Arocha, & 
Lecissi, 2001) also investigated the effects 
of non-PBL curricula on the use and inte- 
gration of basic science and clinical knowl- 
edge and their relationship to reasoning in 
diagnostic explanation. The results showed 
that biomedical and clinical knowledge are 
not integrated and that very little biomedi- 
cal information is used in routine problem- 
solving situations. There is significant use of 
expert-like, data-driven strategies, however, 
in non-PBL students’ explanations. The use 
of biomedical information increases when 
the clinical problems are complex; at the 
same time, hypothesis-driven strategies re- 
place the data-driven strategies. 

Students from a PBL school integrated 
the two types of knowledge and, in contrast 
to the non-PBL students, they spontaneously 
used biomedical information in solving even 
routine problems. We concluded that, for 
students in the non-PBL curriculum, the 
clinical components of problems are treated 
separately from the biomedical science com- 
ponents. The two components of problem 
analysis seem to be viewed as serving dif- 
ferent functions. When needed, however, 
biomedical knowledge is used and seems to act 
as a “glue” that ties the two kinds of 
information together. 

In the PBL curriculum, the integration of 
basic science and clinical knowledge is so 
tight that students appear unable to separate 
the two. As a result, PBL students generate 
unnecessarily elaborate explanations, lead- 
ing to errors of reasoning. Problem-based 


ing in which basic biomedical knowledge 
becomes so tightly tied to specific clinical 
problem types that it becomes difficult to 
decouple this knowledge in context to trans- 
fer to a new situation (Anderson, Reder, & 
Simon, 1996; Holyoak, 1984). 

This outcome is consistent with how 
biomedical information is taught in the class- 
room in PBL schools — by encouraging use of 
the hypothetico-deductive method, result- 
ing in a predominantly backward-directed 
mode of reasoning. Elaborations are accom- 
panied by a tendency to generate errors of 
scientific fact and flawed patterns of ex- 
planation, such as circular reasoning. Even 
though a student’s explanation may be rid- 
dled with bugs and misconceptions, their 
harmful effects may be dependent on the di- 
rection of reasoning. If they reason forward, 
then they are likely to view their existing 
knowledge as adequate. In this case, miscon- 
ceptions may be long lasting and difficult 
to eradicate. If they reason backward, mis- 
conceptions might best be viewed as tran- 
sient hypotheses that, in the light of expe- 
rience, are refuted or modified to form the 
kernel of a more adequate explanation. In- 
terestingly, differences in the patterns of rea- 
soning acquired in both PBL and non-PBL 
medical curricula are found to be quite sta- 
ble, even after the students have completed 
medical school and are in residency training 
programs (Patel, Arocha, & Lecissi, 2001; Patel 
& Kaufman, 2001). 

Instruction that emphasizes decontextu- 
alized abstracted models of phenomena has 
not yielded much success in medicine or 
in other spheres of science education. It is 
widely believed that the amount of transfer 
will be a function of the overlap between 
the original domain of learning and the tar- 
get domain (Holyoak, 1984). Problem-based 
learning’s emphasis on real-world problems 
represents a very good source of transfer to 
clinical situations. However, it is very chal- 
lenging to create a problem set that most ef- 
fectively embodies certain biomedical con- 
cepts while maximizing transfer. Knowledge 
that is overly contextualized actually can re- 
duce transfer. 
All technologies mediate human 
performance. Technologies, whether they be 
computer-based or in some other form, 
transform the ways individuals and groups 
behave. They do not merely augment, en- 
hance, or expedite performance, although 
a given technology may do all of these 
things. The difference is not one of quan- 
titative change but one that is qualitative 
in nature. Technology, tools, and artifacts 
enhance people’s ability to perform tasks 
and change the way they perform tasks. 
In cognitive science, this ubiquitous phe- 
nomenon is called the representational effect, 
which refers to the phenomenon that dif- 
ferent representations of a common abstract 
structure can generate dramatically differ- 
ent representational efficiencies, task com- 
plexities, and behavioral outcomes (Zhang & 
Norman, 1994). 


Technology as External 
Representations 


One approach to the study of how tech- 
nology mediates thinking and reasoning is 
to consider technology as external repre- 
sentations (Zhang & Norman, 1994, 1995; 
Zhang, 1997). External representations are 
the knowledge and structure in the envi- 
ronment as physical symbols, objects, or di- 
mensions (e.g., written symbols, beads of 
an abacus, dimensions of a graph), and as 
external rules, constraints, or relations em- 
bedded in physical configurations (e.g., spa- 
tial relations of written digits, visual and 
spatial layouts of diagrams, physical con- 
straints in abacuses). The information in ex- 
ternal representations can be picked up, an- 
alyzed, and processed by perceptual systems 
alone, although the top-down participation 
of conceptual knowledge from internal rep- 
resentations sometimes facilitates or inhibits 
the perceptual processes. External represen- 
tations are more than inputs and stimuli 
to the internal mind. For many tasks, they 
are intrinsic components without which the 


in nature. 

Diagrams, graphs, pictures, and informa- 
tion displays are typical external represen- 
tations. They are used in many cognitive 
tasks such as problem solving, reasoning, 
and decision making. In studies of the rela- 
tionship between mental images and exter- 
nal pictures, Chambers and Reisberg (1985; 
Reisberg, 1987) showed that external rep- 
resentations could give people access to 
knowledge and skills that are unavailable 
from internal representations. This advan- 
tage typically arises because internal rep- 
resentations are already interpreted and 
difficult to change, whereas external repre- 
sentations are subject to interpretations and 
can lead to different understandings under 
different conditions. In their studies of 
diagrammatic problem solving, Larkin and Simon 
(1987; Larkin, 1989) showed that 
diagrammatic representations help reasoning and 
problem solving because they support oper- 
ators that can recognize features easily and 
make inferences directly. In studies of log- 
ical reasoning with diagrams, Stenning and 
Oberlander (1994) demonstrated that dia- 
grammatic representations such as Euler cir- 
cles limit abstraction and thereby ease pro- 
cessing effort. It is well known that different 
forms of graphic displays have different rep- 
resentational efficiencies for different tasks 
and can cause different cognitive behav- 
iors. For example, Kleinmuntz and Schkade 
(1993) showed that different representa- 
tions (graphs, tables, and lists) of the same in- 
formation can dramatically change decision- 
making strategies: With a tabular display, 
people made one decision, but with a graph 
display of the same information, people 
made a different decision. 


The Impact of Technology on 
Thinking in Medicine 


The mediating role of technology can be 
evaluated at several levels. For example, elec- 
tronic medical records alter the practice of 
individual clinicians in significant ways, as 
mation system substantially impact organi- 
zational and institutional practices, from re- 
search to billing to quality assurance. Even 
the introduction of patient-centered medical 
records early in the twentieth century neces- 
sitated changes in hospital architecture and 
considerably affected work practices in clin- 
ical settings. Salomon, Perkins, and Glober- 
son (1991) introduced a useful distinction 
in considering the mediating role of tech- 
nology on individual performance — the ef- 
fects with technology and the effects of tech- 
nology. The former is concerned with the 
changes in performance displayed by users 
while equipped with the technology. For 
example, when using an effective medical 
information system, physicians should be 
able to gather information more systemat- 
ically and efficiently. In this capacity, med- 
ical information technologies may alleviate 
some of the cognitive load associated with 
a given task and permit physicians to focus 
on higher-order thinking skills, such as hy- 
pothesis generation and evaluation. The ef- 
fects of technology refer to enduring changes 
in general cognitive capacities (knowledge 
and skills) as a consequence of interaction 
with a technology. For example, frequent 
use of information technologies may re- 
sult in lasting changes in medical decision- 
making practices even in the absence of 
the system. 

In several studies involving the mediating 
role of technology in clinical practice, Patel 
and colleagues (Patel et al., 2000) observed 
the change in thinking and reasoning pat- 
terns caused by the change in methods of 
writing patient records, from paper records 
to electronic medical records (EMR). They 
found that before using EMR, physicians fo- 
cused on exploration and discovery, used 
complex propositions, and tended to use 
data-driven reasoning. With EMR, which 
structures data, physicians focus on problem 
solving, use simple propositions, and tend to 
use problem-directed and hypothesis-driven 
reasoning. The change of behavior caused by 
the use of EMR remains when physicians go 
back to paper records, showing the enduring 
in medicine. 

As the basis for many medical decisions, 
diagnostic reasoning requires collecting, 
understanding, and using many types of 
patient information, such as history, lab- 
oratory results, symptoms, prescriptions, 
images, and so on. It is affected by the 
expertise of the clinicians and the way the 
information is acquired, stored, processed, 
and presented. If we consider clinicians 
as rational decision makers, the format of 
a display, as long as it contains the same 
information, should not affect the outcome 
of the reasoning and decision-making 
process. But the formats of displays do 
affect many aspects of clinicians’ task per- 
formance. Several recent studies examined 
how different displays of information in 
EMR affect clinicians’ behavior. Three 
major types of displays have been studied — 
source-based, time-based, and concept- 
based. Source-based displays organize 
medical data by the sources of the data, 
such as encounter notes, laboratory results 
and reports, medications, radiology imaging 
and reports, physical examinations, and so 
on. Time-based displays organize medical 
data as a temporal history. Concept-based 
displays organize medical data by clinically 
meaningful concepts or problems. In this 
case, all data related to each specific prob- 
lem are displayed together. For example, if 
a patient has symptoms such as coughing, 
chest pain, and fever, the laboratory results, 
imaging reports, prescriptions, assessments, 
and plans are displayed together. A study 
by Zeng, Cimino, and Zou (2002) found that
different displays were good for different 
tasks. Source-based displays are good for 
clinicians to retrieve information for a 
specific test or procedure from a specific 
department, for example, whereas concept- 
based displays are good for searching for 
information related to a specific disease. 

With the rapid growth of computer-based 
information systems, we are interacting 
more and more with computer-generated 
health information displays. If these displays 
are to generate the information people need 
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rately, good design is necessary. 


Conclusions and Future Directions 


The process of medical reasoning is one area 
in which advances in cognitive science have 
made significant contributions to investiga- 
tion. In particular, reasoning in medical con- 
texts involving a dense population and a 
high degree of uncertainty (such as criti- 
cal care environments), compounded with 
constraints imposed by resource availability,
leads to increased use of heuristic strategies.
The utility of heuristics lies in limiting
the extent of purposeful search through
data sets, which has substantial practical
value by reducing redundancy. A significant
part of a physician’s cognitive effort is
based on heuristic thinking, but its use in- 
troduces considerable bias in medical rea- 
soning, often resulting in a number of con- 
ceptual and procedural errors. These include 
misconceptions about laws governing prob- 
ability, instantiation of general rules to a 
specific patient at the point of care, prior 
probabilities and actions, and false valida- 
tion. Much of physicians’ reasoning is in- 
ductive with attached probability. Human 
thought is fallible and we cannot appreciate 
the fallibility of our thinking unless we draw 
on understanding of how physicians’ think- 
ing processes operate in the real working 
environment. 

Cognitive studies are increasingly mov- 
ing toward investigations of real-world phe- 
nomena. The constraints of laboratory-based 
work prevent capturing the dynamics of 
real-world problems. This problem is partic- 
ularly salient in high-velocity critical care en- 
vironments. In the best-case scenarios, this is 
creating the potential for great synergy be- 
tween laboratory-based research and cogni- 
tive studies in the “wild.” As discussed in this 
chapter, studies of thinking and reasoning 
in medicine, including a focus on medical 
errors and technology-mediated cognition, 
are increasingly paying attention to dimen- 
sions of medical work in clinical settings. 


Reducing medical errors provides an opportunity
for cognitive scientists to apply cognitive
theories and methodologies to a pressing
practical problem. A trend in health
care, spurred partly by the advent of information
technologies that foster communication,
is the shift of health-care systems toward
being increasingly multidisciplinary,
collaborative, and geographically distributed
across regions. In addition, increasing costs of health
care and rapid knowledge growth have ac- 
celerated the trend toward collaboration of 
health-care professionals in sharing knowl- 
edge and skills. Comprehensive patient care 
necessitates the communication of health- 
care providers in different medical domains, 
thereby optimizing the use of their exper- 
tise. Research on reasoning will need to con- 
tinue to move toward a distributed model 
of cognition. This model will include a fo- 
cus on both socially shared and technology- 
mediated reasoning. 
CHAPTER 31


Intelligence 


Robert J. Sternberg 


What is intelligence? This chapter discusses 
the nature of intelligence and related is- 
sues. The chapter is divided into several ma- 
jor parts: The first discusses people’s con- 
ceptions of intelligence, also referred to as 
implicit theories of intelligence; the second 
presents a brief discussion of intelligence 
testing; the third offers a review of major ap- 
proaches to understanding intelligence; the 
fourth discusses how intelligence can be im- 
proved; and the last part briefly draws some 
conclusions. The chapter does not discuss ar- 
tificial intelligence and computer simulation 
(see Lovett & Anderson, Chap. 17), neural 
networks, or parallel distributed processing 
(see Doumas & Hummel, Chap. 4). 


Implicit Theories of Intelligence 


What do people believe intelligence to be? 
In 1921, when the editors of the Jour- 
nal of Educational Psychology asked 14 fa- 
mous psychologists that question, the re- 
sponses varied but generally embraced two 
themes: Intelligence involves the capacity to
learn from experience and the ability
to adapt to the surrounding environ- 
ment. Sixty-five years later, Sternberg and 
Detterman (1986) asked 24 cognitive
psychologists with expertise in intelligence
research the same question. They,
too, underscored the importance of learning 
from experience and adapting to the envi- 
ronment. They also broadened the definition 
to emphasize the importance of metacog- 
nition — people’s understanding and control 
of their own thinking processes. Contempo- 
rary experts also more heavily emphasized 
the role of culture, pointing out that what 
is considered intelligent in one culture may 
be considered stupid in another (Serpell,
2000). Intelligence, then, is the capacity
to learn from experience, using metacogni- 
tive processes to enhance learning, and the 
ability to adapt to the surrounding environ- 
ment, which may require different adap- 
tations within different social and cultural 
contexts. 

According to the Oxford English Dictio- 
nary, the word intelligence entered our lan- 
guage in about the twelfth century. Today, 
we can look up intelligence in numerous 
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own implicit (unstated) ideas about what it 
means to be smart; that is, we have our own 
implicit theories of intelligence. We use our 
implicit theories in many social situations, 
such as when we meet people or when we 
describe people we know as being very smart 
or not so smart. 

Within our implicit theories of intelli- 
gence, we also recognize that it has differ- 
ent meanings in different contexts. A smart 
salesperson may show a different kind of in- 
telligence than a smart neurosurgeon or a 
smart accountant, each of whom may show 
a different kind of intelligence than a smart 
choreographer, composer, athlete, or sculp- 
tor (see Sternberg et al., Chap. 15, for a 
discussion of the related concept of creativity).
We often use our implicit and context-relevant
definitions of intelligence to make
assessments of intelligence. Is your mechanic 
smart enough to find and fix the problem in 
your car? Is your physician smart enough to 
find and treat your health problem? Is this 
attractive person smart enough to hold your 
interest in a conversation? 

Western notions about intelligence are 
not always shared by other cultures (Sternberg
& Kaufman, 1998). For example, the
Western emphasis on speed of mental processing
(Sternberg et al., 1981) is not shared
in many cultures. Other cultures may even 
be suspicious of the quality of work that 
is done very quickly. Indeed, other cultures 
emphasize depth rather than speed of pro- 
cessing. Even in the West, some prominent 
theorists have pointed out the importance 
of depth of processing for full command of 
material (e.g., Craik & Lockhart, 1972). 

Even within the United States, many peo- 
ple have started viewing as important not 
only the cognitive aspects but also the emo- 
tional aspects of intelligence. Mayer, Salovey, 
and Caruso (2000, p. 396) defined emo- 
tional intelligence as “the ability to perceive 
and express emotion, assimilate emotion in 
thought, understand and reason with emo- 
tion, and regulate emotion in the self and 
others.” There is good evidence for the exis- 
tence of some kind of emotional intelligence 
(Ciarrochi, Forgas, & Mayer, 2001; Mayer 


2000; Salovey & Sluyter, 1997), although 
the evidence is mixed (Davies, Stankov, & 
Roberts, 1998). 

A related concept is that of social intelligence,
the ability to understand and interact
with other people (Kihlstrom & Cantor,
2000). Research also shows that personality
variables are related to intelligence
(Ackerman, 1996). 

Explicit definitions of intelligence fre- 
quently take on an assessment-oriented fo- 
cus. In fact, some psychologists, such as Ed- 
win Boring (1923), have defined intelligence 
as whatever it is that the tests measure. 
This definition, unfortunately, is circular 
and, moreover, what different tests of intelli- 
gence measure is not always the same. Differ- 
ent tests measure somewhat different con- 
structs (Daniel, 1997, 2000; Embretson & 
McCollam, 2000; Kaufman, 2000; Kaufman 
& Lichtenberger, 1998), so it is not feasible 
to define intelligence by what tests test, as 
though they all measured the same thing. Al- 
though most cognitive psychologists do not 
go to that extreme, the tradition of attempt- 
ing to understand intelligence by measuring 
various aspects of it has a long history (Brody, 
2000). 


Intelligence Testing 


History 


Contemporary measurements of intelli- 
gence usually can be traced to one of two 
very different historical traditions. One tradition
concentrated on lower-level psychophysical
abilities (such as sensory acuity,
physical strength, and motor coordination);
the other focused on higher-level judgment
abilities (which we traditionally describe as 
related to thinking). 

Francis Galton (1822-1911) believed that 
intelligence was a function of psychophysical 
abilities and, for several years, Galton main- 
tained a well-equipped laboratory where vis- 
itors could have themselves measured on a 
variety of psychophysical tests. These tests 
measured a broad range of psychophysical 
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crimination (the ability to notice small dif- 
ferences in the weights of objects), pitch 
sensitivity (the ability to hear small differ- 
ences between musical notes), and phys- 
ical strength (Galton, 1883). One of the 
many enthusiastic followers of Galton, Clark 
Wissler (1901), attempted to detect links 
among the assorted tests, which would unify 
the various dimensions of psychophysically 
based intelligence. Much to Wissler’s dis- 
may, no unifying association could be de- 
tected. Moreover, the psychophysical tests 
did not predict college grades. The psy- 
chophysical approach to assessing intelli- 
gence soon faded almost into oblivion, al- 
though it would reappear many years later. 

An alternative to the psychophysical 
approach was developed by Alfred Bi- 
net (1857-1911). He and his collaborator, 
Theodore Simon, also attempted to assess 
intelligence, but their goal was much more 
practical. Binet had been asked to devise a 
procedure to distinguish normal from men- 
tally retarded learners in an academic setting 
(Binet & Simon, 1916). In Binet’s view, judgment,
not psychophysical acuity, strength, or
skill, is the key to intelligence. For Binet
(Binet & Simon, 1916), intelligent thought — 
mental judgment — comprises three distinct 
elements: direction, adaptation, and criti- 
cism. The importance of direction and adap- 
tation certainly fits with contemporary views 
of intelligence, and Binet’s notion of criti- 
cism actually seems prescient, considering 
the current appreciation of metacognitive 
processes as a key aspect of intelligence. Binet
viewed intelligence as a broad potpourri
of cognitive and other abilities and as highly
modifiable.


Major Intelligence Scales 


Lewis Terman of Stanford University built 
on Binet and Simon’s work in Europe and 
constructed the earliest version of what has 
come to be called the Stanford-Binet In- 
telligence Scales (Terman & Merrill, 1937, 
1973; Thorndike, Hagen, & Sattler, 1986). 
For years, the Stanford-Binet test was the 
standard for intelligence tests, and it is still 


scales. The Wechsler tests yield three scores — 
a verbal score, a performance score, and an 
overall score. The verbal score is based on 
tests such as vocabulary and verbal simi- 
larities in which the test-taker has to say 
how two things are similar. The performance 
score is based on tests such as picture com- 
pletion, which requires identification of a 
missing part in a picture of an object; and 
picture arrangement, which requires rear- 
rangement of a scrambled set of cartoon-like 
pictures into an order that tells a coherent 
story. The overall score is a combination of 
the verbal and performance scores. 

Although Wechsler clearly believed in the 
worth of attempting to measure intelligence, 
he did not limit his conception of intelli- 
gence to test scores. Wechsler believed that 
intelligence is not represented just by a test 
score or even by what we do in school. We 
use our intelligence not just in taking tests 
and in doing homework, but also in relating 
to people, in performing our jobs effectively, 
and in managing our lives in general. 


Approaches to Intelligence 


Psychometric Approaches to Intelligence 


Psychologists interested in the structure of 
intelligence have relied on factor analysis as 
an indispensable tool for their research. Factor
analysis is a statistical method for separating
a construct (intelligence, in this case)
into a number of hypothetical factors or abilities
the researchers believe to form the basis
of individual differences in test performance. 
The specific factors derived, of course, still 
depend on the specific questions being asked 
and the tasks being evaluated. 

Factor analysis is based on studies of correlation.
The idea is that the more highly
two tests are correlated, the more likely they
are to measure the same thing. In research
on intelligence, a factor analysis might in- 
volve these steps: (1) Give a large number 
of people several different tests of ability. 
(2) Determine the correlations among all 
those tests. (3) Statistically analyze those 
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small number of factors that summarize peo- 
ple’s performance on the tests. The investi- 
gators in this area have generally agreed on 
and followed this procedure, yet the result- 
ing factorial structures of intelligence have 
differed among theorists such as Spearman, 
Thurstone, Guilford, Cattell, Vernon, and 
Carroll. 
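
The steps of a factor-analytic study can be
illustrated with a small simulation. The sketch
below is only an illustration under stated
assumptions: the six test names, the loadings,
and the sample size are hypothetical, and the
summary step uses power iteration to extract a
single dominant component of the correlation
matrix. That is a simple stand-in for factor
extraction, not the procedure used by any of
the theorists named above.

```python
import math
import random

# Hypothetical test battery and loadings, assumed for illustration only.
TESTS = ["vocabulary", "analogies", "arithmetic",
         "rotation", "recall", "matching"]
TRUE_LOADINGS = [0.8, 0.7, 0.7, 0.6, 0.6, 0.5]


def simulate_scores(n_people=2000, seed=0):
    """Step 1: give many (simulated) people several ability tests."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_people):
        g = rng.gauss(0, 1)  # ability shared by all tests
        rows.append([w * g + math.sqrt(1 - w * w) * rng.gauss(0, 1)
                     for w in TRUE_LOADINGS])
    return rows


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(rows):
    """Step 2: determine the correlations among all the tests."""
    cols = list(zip(*rows))
    return [[pearson(a, b) for b in cols] for a in cols]


def first_component(corr, iterations=200):
    """Step 3: summarize the correlations with one dominant component.

    Power iteration converges to the largest eigenvalue of the
    correlation matrix and its eigenvector.
    """
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(k))
                     for i in range(k))
    loadings = [x * math.sqrt(eigenvalue) for x in v]
    # The trace of a k-by-k correlation matrix is k, so this is the
    # share of total variance captured by the component.
    return loadings, eigenvalue / k


corr = correlation_matrix(simulate_scores())
loadings, share = first_component(corr)
for name, value in zip(TESTS, loadings):
    print(f"{name:12s} {value:.2f}")
print(f"share of total variance: {share:.2f}")
```

Because every simulated test draws on the same
shared ability, all six tests load positively on
the single component. Different generating
assumptions, or different extraction and rotation
choices, would yield a different structure, which
is one reason the theorists' models differ.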


SPEARMAN: THEORY OF G 


Charles Spearman is usually credited with 
inventing factor analysis (Spearman, 1927). 
Using factor-analytic studies, Spearman con- 
cluded that intelligence can be understood 
in terms of both a single general factor that 
pervades performance on all tests of mental 
ability and a set of specific factors, each of 
which is involved in performance on only a 
single type of mental-ability test (e.g., arith- 
metic computations). In Spearman’s view, 
the specific factors are of only casual inter- 
est because of their narrow applicability. To 
Spearman, the general factor, which he la- 
beled “g,” provides the key to understand- 
ing intelligence. Spearman believed g to be 
attributable to “mental energy.” Many psy- 
chologists still believe Spearman's theory to 
be essentially correct (e.g., Jensen, 1998; see 
essays in Sternberg & Grigorenko, 2002). 
The theory is useful in part because g accounts
for a sizable, although not fixed, percentage
of variance in school and job performance,
usually somewhere between 5%
and 40% (Jensen, 1998). Spearman (1923) 
provided a cognitive theory of intelligence. 
He suggested that intelligence comprises 
apprehension of experience (encoding of 
stimuli), eduction of relations (inference of 
relations), and eduction of correlates (appli- 
cation of what is learned). He therefore may 
have been the earliest serious cognitive the- 
orist of intelligence. 
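
To place the variance figures on the more
familiar correlation scale, one can take square
roots. The conversion below assumes a simple
linear relation between g and performance, an
assumption made here for illustration only, so
that the proportion of variance accounted for
equals the squared correlation.

```latex
r = \sqrt{r^{2}}: \qquad
\sqrt{0.05} \approx 0.22, \qquad
\sqrt{0.40} \approx 0.63
```

On that assumption, 5% to 40% of variance
corresponds to correlations of roughly .22 to .63.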


THURSTONE: PRIMARY MENTAL ABILITIES 


In contrast to Spearman, Louis Thurstone 
(1887-1955) concluded (Thurstone, 1938) 
that the core of intelligence resides not in 
one single factor but in seven such factors, 
which he referred to as primary mental abilities:
verbal comprehension, measured by vocabulary
tests; verbal fluency, measured by
time-limited tests requiring the test-taker to 
think of as many words as possible that be- 
gin with a given letter; inductive reason- 
ing, measured by tests such as analogies 
and number-series completion tasks; spatial 
visualization, measured by tests requiring 
mental rotation of pictures of objects; number,
measured by computation and simple
mathematical problem-solving tests; mem- 
ory, measured by picture and word-recall 
tests; and perceptual speed, measured by 
tests that require the test-taker to recognize
small differences in pictures or to cross out
the letter “a” each time it appears in a string
of varied letters.


GUILFORD: THE STRUCTURE OF INTELLECT 


At the opposite extreme from Spearman’s 
single g-factor model is J. P. Guilford’s (1967, 
1982, 1988) structure-of-intellect model, 
which includes up to 150 factors of the mind 
in one version of the theory. According to 
Guilford, intelligence can be understood in 
terms of a cube that represents the intersec- 
tion of three dimensions — operations, con- 
tents, and products. Operations are simply 
mental processes, such as memory and evalu- 
ation (making judgments, such as determin- 
ing whether a particular statement is a fact 
or opinion). Contents are the kinds of terms 
that appear in a problem, such as seman- 
tic (words) and visual (pictures). Products 
are the kinds of responses required, such as 
units (single words, numbers, or pictures), 
classes (hierarchies), and implications. Thus, 
Guilford’s theory, like Spearman’s, had an 
explicit cognitive component. 
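
The cube can be made concrete by enumerating
its cells. The sketch below is illustrative only.
The full category lists are an assumption of the
sketch: they follow the commonly cited
150-factor version of the model and go beyond
the examples named in the text.

```python
from itertools import product

# Category labels assumed from the commonly cited 150-factor version
# of the structure-of-intellect model; only some appear in the text.
OPERATIONS = ["cognition", "memory", "divergent production",
              "convergent production", "evaluation"]
CONTENTS = ["visual", "auditory", "symbolic", "semantic", "behavioral"]
PRODUCTS = ["units", "classes", "relations", "systems",
            "transformations", "implications"]

# Each cell of the cube is one (operation, content, product) triple.
cells = list(product(OPERATIONS, CONTENTS, PRODUCTS))
print(len(cells))  # 5 * 5 * 6 cells
print(cells[0])
```

Each triple, such as evaluation of semantic
units, corresponds to one hypothesized factor.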


CATTELL, VERNON, AND CARROLL: 
HIERARCHICAL MODELS 

A more parsimonious way of handling a 
number of factors of the mind is through 
a hierarchical model of intelligence. One 
such model, developed by Raymond Cat- 
tell (1971), proposed that general intelli- 
gence comprises two major subfactors — fluid 
ability (speed and accuracy of abstract rea- 
soning, especially for novel problems) and 
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and vocabulary). Subsumed within these 
two major subfactors are other, more spe- 
cific factors. A similar view was proposed by 
Philip E. Vernon (1971), who made a gen- 
eral division between practical-mechanical 
and verbal-educational abilities. 

More recently, John B. Carroll (1993) pro- 
posed a hierarchical model of intelligence 
based on his analysis of more than 460 data 
sets obtained between 1927 and 1987. His 
analysis encompasses more than 130,000 
people from diverse walks of life and even 
countries of origin (although non-English- 
speaking countries are poorly represented 
among his data sets). The model Carroll pro- 
posed, based on his monumental undertak- 
ing, is a hierarchy comprising three strata — 
Stratum I, which includes many narrow, 
specific abilities (e.g., spelling ability, speed 
of reasoning); Stratum II, which includes 
various broad abilities (e.g., fluid intelli- 
gence, crystallized intelligence); and Stra- 
tum III, a single general intelligence, much 
like Spearman’s g. 

In addition to fluid intelligence and crys- 
tallized intelligence, Carroll includes in the 
middle stratum learning and memory pro- 
cesses, visual perception, auditory percep- 
tion, facile production of ideas (similar to 
verbal fluency), and speed (which includes 
both sheer speed of response and speed of 
accurate response). Although Carroll does 
not break new ground in that many of the 
abilities in his model have been mentioned 
in other theories, he does masterfully inte- 
grate a large and diverse factor-analytic lit- 
erature, thereby giving great authority to 
his model. Whereas the factor-analytic ap- 
proach has tended to emphasize the struc- 
tures of intelligence, the cognitive approach 
has tended to emphasize the operations of 
intelligence. 


Cognitive Approaches to Intelligence 


Cognitive theorists are interested in study- 
ing how people (or other organisms; Zentall, 
2000) mentally represent and process what 
they learn and know about the world. The 
ways in which various cognitive investigators 


of the complexity of the processes being 
studied. Among the advocates of this ap- 
proach have been Ted Nettelbeck, Arthur 
Jensen, Earl Hunt, Herbert Simon, and my- 
self. Each of these researchers has considered 
both the speed and the accuracy of infor- 
mation processing to be important factors 
in intelligence. In addition to speed and accuracy
of processing, Hunt considered verbal
versus spatial skill, as well as attentional
ability.


INSPECTION TIME 


Nettelbeck (e.g., 1987; Nettelbeck & Lally, 
1976; Nettelbeck & Rabbitt, 1992; see also 
Deary, 2000, 2002; Deary & Stough, 1996) 
suggested a speed-related indicator of intelli- 
gence involving the encoding of visual infor- 
mation for brief storage in working memory. 
But what is critical in this view is not speed 
of response but rather the length of time 
a stimulus must be presented for the sub- 
ject to be able to process that stimulus. The 
shorter the presentation length, the higher 
the score. The key variable is the length of 
time for the presentation of the target stim- 
ulus, not the speed of responding by press- 
ing the button. Nettelbeck operationally de- 
fined inspection time as the length of time 
for presentation of the target stimulus after 
which the participant still responds with at 
least 90% success. Nettelbeck (1987) found 
that shorter inspection times correlate with 
higher scores on intelligence tests [e.g., var- 
ious subscales of the Wechsler Adult Intel- 
ligence Scale (WAIS)] among differing pop- 
ulations of participants. Other investigators 
have confirmed this finding (e.g., Deary & 
Stough, 1996). 
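
The operational definition of inspection time
can be sketched in a few lines. This is a minimal
illustration, not Nettelbeck's procedure: the
function name, the tested durations, and the
accuracy figures are all hypothetical, and the
sketch simply returns the shortest tested
duration at which accuracy, and accuracy at
every longer duration, stays at or above the
90% criterion.

```python
def inspection_time(accuracy_by_duration, criterion=0.90):
    """Return the shortest presentation duration (ms) meeting the criterion.

    accuracy_by_duration maps a presentation duration in ms to the
    proportion of correct responses at that duration. Durations are
    scanned from longest to shortest, stopping at the first one that
    falls below the criterion. Returns None if even the longest
    duration fails.
    """
    threshold = None
    for duration in sorted(accuracy_by_duration, reverse=True):
        if accuracy_by_duration[duration] < criterion:
            break
        threshold = duration
    return threshold


# Hypothetical data for two participants (duration in ms -> accuracy).
participant_a = {20: 0.55, 40: 0.72, 60: 0.91, 80: 0.97, 100: 1.00}
participant_b = {20: 0.60, 40: 0.93, 60: 0.95, 80: 0.99, 100: 1.00}

print(inspection_time(participant_a))  # 60
print(inspection_time(participant_b))  # 40
```

In this made-up example, participant B needs a
shorter presentation than participant A, and so
would receive the higher score on the measure.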


CHOICE REACTION TIME 


Arthur Jensen (1979, 1998, 2002) empha- 
sized a different aspect of information- 
processing speed; specifically, he proposed 
that intelligence can be understood in terms 
of speed of neuronal conduction. In other 
words, the smart person is someone whose 
neural circuits conduct information rapidly. 
When Jensen proposed this notion, direct 
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were not readily available, so Jensen primar- 
ily studied a proposed proxy for measuring 
neural-processing speed — choice reaction 
time, the time it takes to select one answer 
from among several possibilities. For exam- 
ple, suppose that you are one of Jensen’s 
participants. You might be seated in front 
of a set of lights on a board. When one of 
the lights flashed, you would be expected 
to extinguish it by pressing as rapidly as 
possible a button beneath the correct light. 
The experimenter would then measure your 
speed in performing this task. Jensen (1982) 
found that participants with higher intelli- 
gence quotients (IQs) are faster than par- 
ticipants with lower IQs in their reaction 
time (RT), the time between when a light 
comes on and the finger leaves the home 
(central) button. In some studies, partici- 
pants with higher IQs also showed a faster 
movement time, the time between letting 
the finger leave the home button and hitting 
the button under the light. Based on such 
tasks, Reed and Jensen (1991, 1993) pro- 
pose that their findings may be attributable 
to increased central nerve-conduction veloc- 
ity, although at present this proposal remains 
speculative. 
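
The distinction between reaction time and
movement time can be made explicit with three
timestamps per trial. The sketch below is
illustrative only: the record layout, field names,
and timing values are hypothetical and are not
taken from Jensen's apparatus.

```python
from collections import namedtuple

# Hypothetical trial record; all times are in milliseconds.
Trial = namedtuple("Trial", "light_on_ms home_release_ms target_press_ms")


def reaction_time(trial):
    """Time from light onset until the finger leaves the home button."""
    return trial.home_release_ms - trial.light_on_ms


def movement_time(trial):
    """Time from leaving the home button until the target is pressed."""
    return trial.target_press_ms - trial.home_release_ms


trials = [Trial(0, 310, 520), Trial(0, 290, 470), Trial(0, 330, 550)]
mean_rt = sum(reaction_time(t) for t in trials) / len(trials)
mean_mt = sum(movement_time(t) for t in trials) / len(trials)
print(f"mean RT = {mean_rt:.0f} ms, mean MT = {mean_mt:.0f} ms")
```

Keeping the two intervals separate matters
because, as noted above, the relation to IQ was
reported for reaction time and only in some
studies for movement time.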

More recently, researchers have suggested 
that various findings regarding choice RT 
may be influenced by the number of re- 
sponse alternatives and the visual-scanning 
requirements of Jensen's apparatus rather 
than being attributable to the speed of RT 
alone (Bors, MacLeod, & Forrin, 1993). In 
particular, Bors and colleagues found that 
manipulating the number of buttons and the 
size of the visual angle of the display could 
reduce the correlation between IQ and RT. 
Thus, the relation between reaction time and 
intelligence is unclear. 


LEXICAL ACCESS SPEED AND SPEED OF 
SIMULTANEOUS PROCESSING 

Like Jensen, Earl Hunt (1978) suggested 
that intelligence be measured in terms of 
speed. However, Hunt has been particu- 
larly interested in verbal intelligence and 
has focused on lexical-access speed — the 
speed with which we can retrieve information
in our long-term memories. To measure this
speed, Hunt proposed a letter-matching RT 
task (Posner & Mitchell, 1967). 

For example, suppose that you are one 
of Hunt’s participants. You would be shown 
pairs of letters, such as “A A,” “A a,” or “A b.” 
For each pair, you would be asked to indi- 
cate whether the letters constitute a match 
in name (e.g., “A a” match in name of let- 
ter of the alphabet but “A b” do not). You 
would also be given a simpler task, in which 
you would be asked to indicate whether the 
letters match physically (e.g., “A A” are phys- 
ically identical, whereas “A a” are not). Hunt 
would be particularly interested in discern- 
ing the difference between your speed for 
the first set of tasks, involving name match- 
ing, and your speed for the second set, in- 
volving matching of physical characteristics. 
Hunt would take the difference between your
reaction times on the two tasks as a measure
of your speed of lexical access. Thus,
he would subtract the physical-match reaction
time from the name-match reaction time. For Hunt, the
response time in indicating that “A A” is a 
physical match is unimportant. What inter- 
ests him is a more complex reaction time — 
that for recognizing names of letters. He and 
his colleagues have found that students with 
lower verbal ability take longer to gain ac- 
cess to lexical information than do students 
with higher verbal ability. 
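
The subtraction logic of the letter-matching
task can be written out directly. This is a
minimal sketch: the function name and the
reaction times are hypothetical, chosen only to
show how the difference score is formed.

```python
from statistics import mean


def lexical_access_time(name_match_rts, physical_match_rts):
    """Mean name-match RT minus mean physical-match RT, both in ms.

    The physical-match time is treated as a baseline, so the
    difference is attributed to retrieving the letter names.
    """
    return mean(name_match_rts) - mean(physical_match_rts)


# Hypothetical reaction times in milliseconds for two students.
higher_verbal = lexical_access_time([560, 540, 550], [490, 480, 500])
lower_verbal = lexical_access_time([640, 660, 650], [500, 510, 490])

print(higher_verbal)  # 60
print(lower_verbal)   # 150
```

In these made-up data the two students are
about equally fast at physical matching, and the
student with lower verbal ability shows the
larger difference score.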

Earl Hunt and Marcy Lansman (1982) 
also studied people’s ability to divide their 
attention as a function of intelligence. For 
example, suppose that you are asked to solve 
mathematical problems and simultaneously 
to listen for a tone and press a button as soon 
as you hear it. We can expect that you would 
both solve the math problems effectively 
and respond quickly to hearing the tone. 
According to Hunt and Lansman, one thing 
that makes people more intelligent is that 
they are better able to timeshare between 
two tasks and to perform both effectively. 

In sum, process timing theories attempt 
to account for differences in intelligence 
by appealing to differences in the speed 
of various forms of information process- 
ing; inspection time, choice RT, and lexical 
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relate with measures of intelligence. These 
findings suggest that higher intelligence 
may be related to the speed of various 
information-processing abilities, including 
encoding information more rapidly into 
working memory, accessing information in 
long-term memory more rapidly, and re- 
sponding more rapidly. 

Why would more rapid encoding, re- 
trieval, and responding be associated with 
higher intelligence test scores? Do rapid 
information processors learn more? Other 
research on learning in aged persons investi- 
gated whether there is a link between age- 
related slowing of information processing 
and (1) initial encoding and recall of infor- 
mation and (2) long-term retention (Nettel- 
beck et al., 1996; Bors & Forrin, 1995). The 
findings suggest that the relation between in- 
spection time and intelligence may not be 
related to learning. In particular, Nettelbeck 
et al. found there is a difference between 
initial recall and actual long-term learning — 
whereas initial recall performance is me- 
diated by processing speed (older, slower 
participants showed deficits), longer-term 
retention of new information (preserved in 
older participants) is mediated by cognitive 
processes other than speed of processing, 
including rehearsal strategies. This implies
that speed of information processing may influence
initial performance on recall and inspection
time tasks, but speed is not related
to long-term learning. Perhaps faster information
processing aids participants in performance
aspects of intelligence test tasks,
rather than contributing to actual learn- 
ing and intelligence (see also Salthouse, 
Chap. 24). Clearly, this area requires more 
research to determine how information- 
processing speed relates to intelligence. 


WORKING MEMORY 


Recent work suggests that a critical compo- 
nent of intelligence may be working mem- 
ory (see Morrison, Chap. 19 for a discussion 
of working memory in thinking). Indeed, 
Kyllonen (2002) and Kyllonen and Christal 
(1990) have argued that intelligence may be 


man and Carpenter (1983) had participants 
read sets of passages and, after they had read 
the passages, try to remember the last word 
of each passage. Recall was highly corre- 
lated with verbal ability. Turner and Engle 
(1989) had participants perform a variety 
of working-memory tasks. In one task, for 
example, the participants saw a set of sim- 
ple arithmetic problems, each of which was 
followed by a word or a digit. An example 
would be “Is (3 × 5) − 6 = 7?” TABLE. The
participants saw sets of from two to six such 
problems and solved each one. After solving 
the problems in the set, they tried to recall 
the words that followed the problems. The 
number of words recalled was highly corre- 
lated with measured intelligence. It there- 
fore appears that the ability to store and 
manipulate information in working memory 
may be an important aspect of intelligence, 
although probably not all there is to intelli- 
gence (see Morrison, Chap. 19 for discussion 
of working memory and thinking). 
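
The structure of such a task, verifying an
arithmetic statement and then holding on to the
word that follows it, can be sketched as follows.
This is an illustration only, not Turner and
Engle's materials or scoring rules: apart from
the first item, which mirrors the example in the
text, the problems, words, and function names
are hypothetical.

```python
def verify(a, b, c, claimed):
    """Answer an item of the form 'Is (a x b) - c = claimed?'."""
    return (a * b) - c == claimed


def score_set(items, recalled):
    """Count the to-be-remembered words from one set that were recalled.

    items is a list of (problem, word) pairs, where problem is an
    (a, b, c, claimed) tuple.
    """
    targets = [word for _, word in items]
    return sum(1 for word in targets if word in recalled)


# A hypothetical set of three items (sets ranged from two to six).
trial_set = [
    ((3, 5, 6, 7), "TABLE"),
    ((4, 2, 3, 5), "RIVER"),
    ((6, 3, 8, 10), "CLOCK"),
]

answers = [verify(*problem) for problem, _ in trial_set]
print(answers)  # [False, True, True]
print(score_set(trial_set, recalled=["TABLE", "CLOCK"]))  # 2
```

The participant must do both things at once:
process each problem and store each word.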


THE COMPONENTIAL THEORY AND COMPLEX 
PROBLEM SOLVING

In my early work on intelligence, I (Sternberg,
1977) began using cognitive approaches
to study information processing in
more complex tasks, such as analogies, se- 
ries problems (e.g., completing a numerical 
or figural series), and syllogisms (Sternberg, 
1977, 1983, 1985). The goal was to find out 
just what made some people more intelli- 
gent processors of information than others. 
The idea was to take the kinds of tasks used 
on conventional intelligence tests and to iso- 
late the components of intelligence — the 
mental processes used in performing these 
tasks, such as translating a sensory input 
into a mental representation, transforming 
one conceptual representation into another, 
or translating a conceptual representation 
into a motor output (Sternberg, 1982). Since 
then, many people have elaborated upon 
and expanded this basic approach (Lohman, 
2000). 

Componential analysis breaks down peo- 
ple’s reaction times and error rates on these 
tasks in terms of the processes that make 
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that people may solve analogies and simi- 
lar tasks by using several component pro- 
cesses including encoding the terms of the 
problem, inferring relations among at least 
some of the terms, mapping the inferred re- 
lations to other terms that would be pre- 
sumed to show similar relations, and apply- 
ing the previously inferred relations to the 
new situations. 

Consider the analogy, LAWYER : 
CLIENT :: DOCTOR : (a. PATIENT b. 
MEDICINE). To solve this analogy, you 
need to encode each term of the problem, 
which includes perceiving a term and re- 
trieving information about it from memory. 
You then infer the relationship between 
lawyer and client — that the former provides 
professional services to the latter. You 
then map the relationship in the first half 
of the analogy to the second half of the 
analogy, noting that it will involve that same 
relationship. Finally, you apply that inferred 
relationship to generate the final term of 
the analogy, leading to the appropriate 
response of PATIENT. Studying these com- 
ponents of information processing reveals 
more than measuring mental speed alone 
(see Holyoak, Chapter 6, for a detailed 
discussion of analogical reasoning). 
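
The decomposition described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the studies cited: it assumes a simple additive model in which total solution time is the sum of the times for encoding, inferring, mapping, and applying, and all function names and timings are invented for the example.

```python
# Illustrative sketch only: the component names follow the text, but the
# additive model and every number below are hypothetical assumptions.
COMPONENTS = ("encode", "infer", "map", "apply")


def total_time_ms(durations_ms):
    """Return total solution time as the sum of the component durations."""
    return sum(durations_ms[name] for name in COMPONENTS)


def component_shares(durations_ms):
    """Return the fraction of total solution time spent in each component."""
    total = total_time_ms(durations_ms)
    return {name: durations_ms[name] / total for name in COMPONENTS}


# Invented timings (milliseconds) for two hypothetical solvers of
# LAWYER : CLIENT :: DOCTOR : (a. PATIENT b. MEDICINE).
solver_a = {"encode": 1200, "infer": 300, "map": 200, "apply": 300}
solver_b = {"encode": 800, "infer": 600, "map": 400, "apply": 600}

for label, timings in (("A", solver_a), ("B", solver_b)):
    shares = component_shares(timings)
    print(label, total_time_ms(timings), round(shares["encode"], 2))
```

Splitting the total into components shows where each solver's time goes, which a single overall speed score cannot show.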

When measuring speed alone, I found sig- 
nificant correlations between speed in exe- 
cuting these processes and performance on 
other traditional intelligence tests. However, 
a more intriguing discovery is that partici- 
pants who score higher on traditional intel- 
ligence tests take longer to encode the terms 
of the problem than do less intelligent par- 
ticipants, but they make up for the extra 
time by taking less time to perform the re- 
maining components of the task. In general, 
more intelligent participants take longer dur- 
ing global planning — encoding the problem 
and formulating a general strategy for attack- 
ing the problem (or set of problems) — but 
they take less time for local planning — form- 
ing and implementing strategies for the de- 
tails of the task (Sternberg, 1981). 

The advantage of spending more time 
on global planning is the increased likelihood
that the overall strategy will be correct.
Thus, brighter people may take longer
to do something than will less bright people
when taking more time is advantageous.
For example, the brighter person might 
spend more time researching and planning 
a term paper but less time in actually writ- 
ing it. This same differential in time alloca- 
tion has been shown in other tasks as well 
(e.g., in solving physics problems; Larkin et 
al., 1980; Sternberg, 1979, 1985); that is, 
more intelligent people seem to spend more 
time planning for and encoding the prob- 
lems they face but less time in the other 
components of task performance. This may 
relate to the previously mentioned metacog- 
nitive attribute many include in their notions 
of intelligence. The bottom line, then, is that 
intelligence may reside as much in how peo- 
ple allocate time as it does in the amount of 
time it takes them to perform cognitive tasks. 

In a similarly cognitive approach, Simon
studied the information processing of people 
engaged in complex problem-solving situa- 
tions, such as when playing chess and per- 
forming logical derivations (Newell & Si- 
mon, 1972; Simon, 1976). A simple, brief 
task might require the participant to view an 
arithmetic or geometric series, figure out the 
rule underlying the progression, and guess 
what numeral or geometric figure might 
come next. More complex tasks
might include some problem-solving tasks
(e.g., the water jugs problems; see Estes, 
1982). These problems were similar or iden- 
tical to those used on intelligence tests. 


Biological Approaches to Intelligence 


Although the human brain is clearly the 
organ responsible for human intelligence, 
early studies (e.g., those by Karl Lashley 
and others) seeking to find biological in- 
dices of intelligence and other aspects of 
mental processes were a resounding fail- 
ure despite great efforts. As tools for study- 
ing the brain have become more sophisti- 
cated, however, we are beginning to see the 
possibility of finding physiological indica- 
tors of intelligence. Some investigators (e.g., 
Matarazzo, 1992) believe that we will have 
clinically useful psychophysiological indices
of intelligence early in the current
millennium, although widely applicable indices
will be much longer in coming. In the
meantime, the biological studies we now 
have are largely correlational, showing sta- 
tistical associations between biological and 
psychometric or other measures of intelli- 
gence. The studies do not establish causal 
relations (see Goel, Chapter 20, for a de- 
scription of the neural basis of deductive 
reasoning). 


BRAIN SIZE 


One line of research looks at the relationship
of brain size to intelligence (see Jerison,
2000; Vernon et al., 2000). The evidence
suggests that, for humans, there is a mod- 
est but significant statistical relationship 
between brain size and intelligence. It is dif- 
ficult to know what to make of this rela- 
tionship, however, because greater brain size 
may cause greater intelligence, greater intel- 
ligence may cause greater brain size, or both 
may depend on some third factor. Moreover, 
it probably is more important how efficiently 
the brain is used than what size it is. On aver- 
age, for example, men have larger brains than 
women, but women have better connections 
of the two hemispheres of the brain through 
the corpus callosum. So it is not clear which 
gender, on average, would be at an advan- 
tage, and probably neither would be. It is 
important to note that the relationship be- 
tween brain size and intelligence does not 
hold across species (Jerison, 2000). Rather, 
what holds seems to be a relationship be- 
tween intelligence and brain size relative to 
the rough general size of the organism. 


SPEED OF NEURAL CONDUCTION 


Complex patterns of electrical activity in 
the brain, which are prompted by specific 
stimuli, appear to correlate with scores on 
IQ tests (Barrett & Eysenck, 1992). Several
studies (e.g., McGarry-Roberts, Stelmack, & 
Campbell, 1992; Vernon & Mori, 1992) ini- 
tially suggested that speed of conduction 
of neural impulses correlates with intelli- 
gence as measured by IQ tests. A follow- 
up study (Wickett & Vernon, 1994), however,
could not find a strong relation between
neural-conduction velocity (as measured by 
neural-conduction speeds in a main nerve 
of the arm) and intelligence (as measured 
on the Multidimensional Aptitude Battery). 
Surprisingly, neural-conduction velocity ap- 
pears to be a more powerful predictor of 
IQ scores for men than for women, so gen- 
der differences may account for some of the 
differences in the data (Wickett & Vernon, 
1994). Additional studies on both males and 
females are needed. 


POSITRON EMISSION TOMOGRAPHY, FUNCTIONAL 
MAGNETIC RESONANCE IMAGING 

An alternative approach to studying the 
brain suggests that neural efficiency may be 
related to intelligence; such an approach is 
based on studies of how the brain metabolizes
glucose (a simple sugar required for brain
activity) during mental activities. Richard
Haier and colleagues (Haier et al., 1992) 
cited several other researchers who support 
their own findings that higher intelligence 
correlates with reduced levels of glucose 
metabolism during problem-solving tasks — 
that is, smarter brains consume less sugar 
(and hence expend less effort) than do less 
smart brains doing the same task. Further- 
more, Haier and colleagues found that cere- 
bral efficiency increases as a result of learning 
on a relatively complex task involving visu- 
ospatial manipulations (the computer game 
Tetris). As a result of practice, more in- 
telligent participants show not only lower 
cerebral glucose metabolism overall but also 
more specifically localized metabolism of 
glucose. In most areas of their brains, smarter 
participants show less glucose metabolism, 
but in selected areas of their brains (believed 
to be important to the task at hand), they 
show higher levels of glucose metabolism. 
Thus, more intelligent participants may have 
learned how to use their brains more effi- 
ciently to focus their thought processes on a 
given task. 

More recent research by Haier and col- 
leagues suggests that the relationship be- 
tween glucose metabolism and intelligence 
may be more complex (Haier et al., 1995; 
Larson et al., 1995). Whereas Haier’s group 
confirmed the earlier findings of increased
glucose metabolism in less smart participants
(in this case, mildly retarded participants),
the study by Larson et al. (1995)
found, contrary to the earlier findings, that 
smarter participants had increased glucose 
metabolism relative to their average compar- 
ison group. 

One problem with earlier studies is that 
the tasks used were not matched for diffi- 
culty level across groups of smart and av- 
erage individuals. The Larson et al. study 
used tasks that were matched to the ability 
levels of the smarter and average partici- 
pants and found that the smarter partici- 
pants used more glucose. Moreover, the glu- 
cose metabolism was highest in the right 
hemisphere of the more intelligent participants
performing the hard task, again
suggesting selectivity of brain areas. What 
could be driving the increases in glucose 
metabolism? Currently, the key factor appears
to be subjective task difficulty, with
smarter participants in earlier studies simply 
finding the tasks too easy. Matching task dif- 
ficulty to participants’ abilities seems to in- 
dicate that smarter participants increase glu- 
cose metabolism when the task demands it. 
The preliminary findings in this area need to 
be investigated further before any conclusive 
answers are reached. 

Some neuropsychological research (e.g.,
Dempster, 1991) suggests that performance 
on intelligence tests may not indicate a cru- 
cial aspect of intelligence — the ability to 
set goals, to plan how to meet them, and 
to execute those plans. Specifically, persons 
with lesions in the frontal lobe of the brain 
frequently perform quite well on standard- 
ized IQ tests, which require responses to 
questions within a highly structured situa- 
tion, but do not require much in the way 
of goal setting or planning. If intelligence 
involves the ability to learn from experi- 
ence and to adapt to the surrounding envi- 
ronment, the ability to set goals and to de- 
sign and implement plans cannot be ignored. 
An essential aspect of goal setting and plan- 
ning is the ability to attend appropriately to 
relevant stimuli and to ignore or discount 
irrelevant stimuli. 


Some theorists have tried to understand 
intelligence in terms of how it has 
evolved over the eons (e.g., Bjorklund & 
Kipp, 2002; Bradshaw, 2002; Byrne, 2002; 
Calvin, 2002; Corballis, 2002; Cosmides & 
Tooby, 2002; Flanagan, Hardcastle, & Nah- 
mias, 2002; Grossman & Kaufman, 2002; 
Pinker, 1997). The basic idea in these mod- 
els is that we are intelligent in the ways 
we are because it was important for our 
distant ancestors to acquire certain sets of 
skills. According to Cosmides and Tooby 
(2002), for example, we are particularly sensitive
to cheating because people in
the past who were not sensitive to cheaters 
did not live to have children, or had fewer 
children. Evolutionary approaches stress the 
continuity of the nature of intelligence over 
long stretches of time, and in some theo- 
ries, across species. However, during evolu- 
tion, the frontal lobe increased in size, so it 
is difficult to know whether changes in in- 
telligence are just a manifestation of physio- 
logical changes or the other way around. 


Contextual Approaches to Intelligence 


According to contextualists, intelligence 
cannot be understood outside its real-world 
context. The context of intelligence may 
be viewed at any level of analysis, focusing
narrowly on the home and family environment
or extending broadly to entire
cultures (see Greenfield, Chap. 27). Even 
cross-community differences have been cor- 
related with differences in performance on 
intelligence tests; such context-related dif- 
ferences include those of rural versus urban 
communities, low versus high proportions 
of teenagers to adults within communities, 
and low versus high socioeconomic status 
of communities (see Coon, Carey, & Fulker, 
1992). Contextualists are particularly in- 
trigued by the effects of cultural context 
on intelligence. 

In fact, contextualists consider intelli- 
gence so inextricably linked to culture that 
they view intelligence as something that 
a culture creates to define the nature of 
adaptive performance in that culture and to 
account for why some people perform better
than others on the tasks that the culture
happens to value (Sternberg, 1985). Theo- 
rists who endorse this model study just how 
intelligence relates to the external world in 
which the model is being applied and eval- 
uated. In general, definitions and theories of 
intelligence will more effectively encompass 
cultural diversity by broadening in scope. Be- 
fore exploring some of the contextual the- 
ories of intelligence, we will look at what 
prompted psychologists to believe that cul- 
ture might play a role in how we define and 
assess intelligence. 

People in different cultures may have 
quite different ideas of what it means to be 
smart. One of the more interesting cross- 
cultural studies of intelligence was per- 
formed by Michael Cole and colleagues 
(Cole et al., 1971). These investigators asked 
adult members of the Kpelle tribe in Africa 
to sort concept terms. In Western culture, 
when adults are given a sorting task on an in- 
telligence test, more intelligent people typ- 
ically sort hierarchically. For example, they 
may sort names of different kinds of fish to- 
gether, and then the word fish over that, with 
the name animal over fish and over birds, and 
so on. Less intelligent people typically sort 
functionally. They may sort fish with eat, for 
example, because we eat fish, or clothes with 
wear, because we wear clothes. The Kpelle 
sorted functionally — even after investigators 
unsuccessfully tried to get the Kpelle sponta- 
neously to sort hierarchically. Finally, in des- 
peration, one of the experimenters (Glick) 
asked a Kpelle to sort as a foolish person 
would sort. In response, the Kpelle quickly 
and easily sorted hierarchically. The Kpelle 
had been able to sort this way all along; they 
just hadn’t done it because they viewed it 
as foolish — and they probably considered 
the questioners rather unintelligent for ask- 
ing such stupid questions. 

The Kpelle people are not the only ones 
who might question Western understand- 
ings of intelligence. In the Puluwat culture of 
the Pacific Ocean, for example, sailors navi- 
gate incredibly long distances, using none of 
the navigational aids that sailors from tech- 
nologically advanced countries would need 
to get from one place to another (Gladwin,
1970). Were Puluwat sailors to devise intelligence
tests for us and our fellow Americans,
we might not seem very intelligent.
Similarly, the highly skilled Puluwat sailors 
might not do well on American-crafted tests 
of intelligence. These and other observations 
have prompted quite a few theoreticians to 
recognize the importance of considering cul- 
tural context when assessing intelligence. 

The preceding arguments may make it 
clear why it is so difficult to come up with 
a test that everyone would consider culture- 
fair — equally appropriate and fair for mem- 
bers of all cultures. If members of differ- 
ent cultures have different ideas of what it 
means to be intelligent, then the very be- 
haviors that may be considered intelligent in 
one culture may be considered unintelligent 
in another. Take, for example, the concept of 
mental quickness. In mainstream American 
culture, quickness is usually associated with 
intelligence. To say someone is “quick” is to 
say that the person is intelligent and, indeed, 
most group tests of intelligence are quite 
strictly timed. Even on individual tests of 
intelligence, the test-giver times some re- 
sponses of the test-taker. Many information- 
processing theorists and even psychophys- 
iological theorists focus on the study of 
intelligence as a function of mental speed. 

In many cultures of the world, people 
believe that more intelligent people do not 
rush into things. Even in our own culture, 
no one will view you as brilliant if you de- 
cide on a marital partner, a job, or a place 
to live in the 20 to 30 seconds you might 
normally have to solve an intelligence-test 
problem. Thus, given that there exist no 
perfectly culture-fair tests of intelligence, at 
least at present, how should we consider 
context when assessing and understanding 
intelligence? 

Several researchers have suggested that 
providing culture-relevant tests is possi- 
ble (e.g., Baltes, Dittmann-Kohli, & Dixon, 
1984; Jenkins, 1979; Keating, 1984); that 
is, tests that employ skills and knowledge 
that relate to the cultural experiences of 
the test-takers. Baltes and his colleagues, for 
example, designed tests measuring skill in 
dealing with the pragmatic aspects of everyday
life. Designing culture-relevant tests
requires creativity and effort but probably 
is not impossible. A study by Daniel Wagner
(1978), for example, investigated memory
abilities (one aspect of intelligence as
our culture defines it) in our culture versus
the Moroccan culture. Wagner found
that level of recall depended on the content 
that was being remembered, with culture- 
relevant content being remembered more 
effectively than irrelevant content (e.g.,
compared with Westerners, Moroccan rug 
merchants were better able to recall com- 
plex visual patterns on black-and-white pho- 
tos of Oriental rugs). Wagner further sug- 
gested that when tests are not designed to 
minimize the effects of cultural differences, 
the key to culture-specific differences in 
memory might be the knowledge and use 
of metamemory strategies, rather than ac- 
tual structural differences in memory (e.g., 
memory span and rates of forgetting). 

In Kenya, research has shown that ru- 
ral Kenyan school children have substantial 
knowledge about natural herbal medicines 
they believe fight infection; Western chil- 
dren, of course, would not be able to 
identify any of these medicines (Sternberg 
et al., 2001; Sternberg & Grigorenko, 1997). 
In short, making a test culturally rele- 
vant appears to involve much more than 
just removing specific linguistic barriers 
to understanding. 

Stephen Ceci (Ceci & Roazzi, 1994) 
found similar context effects in children’s
and adults’ performance on a variety of tasks.
Ceci suggests that the social context (e.g., 
whether a task is considered masculine or 
feminine), the mental context (e.g., whether 
a visuo-spatial task involves buying a home 
or burgling it), and the physical context (e.g.,
whether a task is presented at the beach or 
in a laboratory) all affect performance. For 
example, fourteen-year-old boys performed 
poorly on a task when it was couched as 
a cupcake-baking task but performed well 
when it was framed as a battery-charging 
task (Ceci & Bronfenbrenner, 1985). Brazilian
maids had no difficulty with proportional
reasoning when hypothetically purchasing
food but had great difficulty with
it when hypothetically purchasing medicinal
herbs (Schliemann & Magalhães, 1990).
Brazilian children whose poverty had forced 
them to become street vendors showed no 
difficulty in performing complex arithmetic 
computations when selling things but had 
great difficulty performing similar calcula- 
tions in a classroom (Carraher, Carraher, & 
Schliemann, 1985). Thus, test performance 
may be affected by the context in which 
the test terms are presented. In these studies,
the investigators looked at the interaction 
of cognition and context. Several investiga- 
tors have proposed theories that seek explic- 
itly to examine this interaction within an in- 
tegrated model of many aspects of intelli- 
gence. Such theories view intelligence as a 
complex system. 


Systems Approaches to Intelligence 


GARDNER: MULTIPLE INTELLIGENCES 


Howard Gardner (1983, 1993) proposed a 
theory of multiple intelligences, in which in- 
telligence is not just a single, unitary con- 
struct. Instead of speaking of multiple abil- 
ities that together constitute intelligence 
(e.g., Thurstone, 1938), Gardner (1999) 
speaks of eight distinct intelligences that are 
relatively independent of each other. Each 
is a separate system of functioning, although 
these systems can interact to produce what 
we see as intelligent performance. 

In some respects, Gardner’s theory 
sounds like a factorial one because it specifies 
several abilities that are construed to reflect 
intelligence of some sort. However, Gardner 
views each ability as a separate intelligence, 
not just as a part of a single whole. Moreover, 
a crucial difference between Gardner’s the- 
ory and factorial ones is in the sources of evi- 
dence Gardner used for identifying the eight 
intelligences. Gardner used converging op- 
erations, gathering evidence from multiple 
sources and types of data. 

Gardner’s view of the mind is modular.
As a result, a major task of existing and future
research on intelligence is to isolate the
portions of the brain responsible for each
of the intelligences. Gardner has speculated
about at least some of these locations, but
hard evidence for the existence of these separate
intelligences has yet to be produced.
Furthermore, Nettelbeck and Young (1996) 
question the strict modularity of Gardner's 
theory. Specifically, treating the phenomenon
of preserved specific cognitive functioning in
autistic savants (persons with severe social and
cognitive deficits, but with corresponding
high ability in a narrow domain) as evidence
for modular intelligences may not be justified.
According to Nettelbeck and Young,
the narrow long-term memory and specific
aptitudes of savants are not really intelligent.
As a result, there may be reason to question 
the intelligence of inflexible modules. 


STERNBERG: THE TRIARCHIC THEORY OF 
SUCCESSFUL INTELLIGENCE 

Whereas Gardner emphasizes the separate- 
ness of the various aspects of intelligence, I 
tend to emphasize the extent to which they 
work together in the triarchic theory of suc- 
cessful intelligence (Sternberg, 1985, 1988, 
1996, 1999). According to the triarchic (tri-, 
“three”; -archic, “governed”) theory, intelli- 
gence comprises three aspects, dealing with 
the relation of intelligence (1) to the internal 
world of the person, (2) to experience, and 
(3) to the external world. 

How intelligence relates to the internal 
world. This part of the theory emphasizes 
the processing of information, which can be 
viewed in terms of three different kinds of 
components: (1) metacomponents — execu- 
tive processes (i.e., metacognition) used to 
plan, monitor, and evaluate problem solving; 
(2) performance components — lower order 
processes used to implement the commands 
of the metacomponents; and (3) knowledge- 
acquisition components — the processes used 
to learn how to solve the problems in 
the first place. The components are highly 
interdependent. 

How intelligence relates to experience. The 
theory also considers how prior experi- 
ence may interact with all three kinds of 
information-processing components. That 
is, each of us faces tasks and situations with 
which we have varying levels of experience, 
ranging from a completely novel task, with
which we have no previous experience, to
a completely familiar task, with which we
have vast, extensive experience. As a task 
becomes increasingly familiar, many aspects 
of the task may become automatic, requir- 
ing little conscious effort to determine what 
step to take next and how to implement 
that next step. A novel task makes demands 
on intelligence different from those of a 
task for which automatic procedures have 
been developed. 

According to the triarchic theory, rela- 
tively novel tasks — such as visiting a foreign 
country, mastering a new subject, or acquir- 
ing a foreign language — demand more of a 
person’s intelligence. In fact, a completely 
unfamiliar task may demand so much of the 
person as to be overwhelming. 

How intelligence relates to the external 
world. The triarchic theory also proposes 
that the various components of intelli- 
gence are applied to experience to serve 
three functions in real-world contexts — 
adapting ourselves to our existing environ- 
ments, shaping our existing environments 
to create new environments, and selecting 
new environments. 

According to the triarchic theory, people 
may apply their intelligence to many differ- 
ent kinds of problems. Some people may 
be more intelligent in the face of abstract, 
academic problems, for example, whereas 
others may be more intelligent in the face 
of concrete, practical problems. The the- 
ory does not define an intelligent person as 
someone who necessarily excels in all as- 
pects of intelligence. Rather, intelligent per- 
sons know their own strengths and weak- 
nesses and find ways in which to capitalize 
on their strengths and either to compensate 
for or to correct their weaknesses. 

In a recent comprehensive study testing 
the validity of the triarchic theory and its 
usefulness in improving performance, we 
predicted that matching students’ instruc- 
tion and assessment to their abilities would 
lead to improved performance (Sternberg 
et al., 1996, 1999). Students were selected 
for one of five ability patterns: high only in 
analytical ability, high only in creative abil- 
ity, high only in practical ability, high in all 
three abilities, or not high in any of the three
abilities. Then students were assigned at random
to one of four instructional groups that
emphasized memory-based, analytical, cre- 
ative, or practical learning followed by sub- 
sequent assessment. We found that students 
who were placed in an instructional condi- 
tion that matched their strength in terms of 
ability pattern (e.g., a high-analytical student 
being placed in an instructional condition 
that emphasized analytical thinking) out- 
performed students who were mismatched 
(e.g., a high-analytical student being placed
in an instructional condition that empha- 
sized practical thinking). 

Teaching all students to use all of their 
analytic, creative, and practical abilities has 
resulted in improved school achievement 
for all students, whatever their ability pat- 
tern (Grigorenko, Jarvin, & Sternberg, 2002; 
Sternberg, Torff, & Grigorenko, 1998). One 
important consideration in light of such find- 
ings is the need for changes in the assess- 
ment of intelligence (Sternberg & Kaufman, 
1996). Current measures of intelligence are 
somewhat one-sided, measuring mostly an- 
alytic abilities with little or no assessment 
of creative and practical aspects of intel- 
ligence (Sternberg et al., 2000; Wagner, 
2000). A well-rounded assessment and in- 
struction system could lead to greater ben- 
efits of education for a wider variety of stu- 
dents — a nominal goal of education. 


TRUE INTELLIGENCE 


Perkins (1995) proposed a theory of what he 
refers to as true intelligence, which he believes 
synthesizes classic views as well as new ones. 
According to Perkins, there are three basic 
aspects of intelligence — neural, experiential, 
and reflective. 

Neural intelligence concerns what Perkins 
believes to be the fact that some people’s 
neurological systems function better than do 
the neurological systems of others, running 
faster and with more precision. He men- 
tions “more finely tuned voltages” and “more 
exquisitely adapted chemical catalysts” as 
well as a “better pattern of connectivity in
the labyrinth of neurons” (Perkins, 1995),
although it is not clear
what any of these terms means. Perkins
believes this aspect of intelligence to be 
largely genetically determined and unlearn- 
able. This kind of intelligence seems to be 
somewhat similar to Cattell’s (1971) idea of 
fluid intelligence. 

The experiential aspect of intelligence is 
what has been learned from experience. It is 
the extent and organization of the knowl- 
edge base and thus is similar to Cattell’s 
(1971) notion of crystallized intelligence. 

The reflective aspect of intelligence refers 
to the role of strategies in memory and prob- 
lem solving, and appears to be similar to 
the construct of metacognition or cogni- 
tive monitoring (Brown & DeLoache, 1978; 
Flavell, 1981). 

No empirical test of the theory of true 
intelligence has been published, so it is diffi- 
cult to evaluate the theory at this time. Like 
Gardner’s (1983) theory, Perkins’s theory is 
based on literature review, and, as noted pre- 
viously, such literature reviews often tend to 
be selective and then interpreted in a way 
that maximizes the fit of the theory to the 
available data. 


THE BIOECOLOGICAL MODEL OF INTELLIGENCE 


Ceci (1996) proposed a bioecological model 
of intelligence, according to which multi- 
ple cognitive potentials, context, and knowl- 
edge all are essential bases of individual 
differences in performance. Each of the mul- 
tiple cognitive potentials enables relation- 
ships to be discovered, thoughts to be moni- 
tored, and knowledge to be acquired within 
a given domain. Although these potentials 
are biologically based, their development is 
closely linked to environmental context, and 
it is difficult, if not impossible, to cleanly 
separate biological from environmental 
contributions to intelligence. Moreover, abil- 
ities may express themselves very differ- 
ently in different contexts. For example, chil- 
dren given essentially the same task in the 
context of a video game versus a labora- 
tory cognitive task performed much better 
when the task was presented in the video 
game context. 

The bioecological model appears in many
ways to be more a framework than a theory.
At some level, the theory must be right. Certainly,
both biological and ecological factors
contribute to the development and manifes- 
tation of intelligence. Perhaps what the the- 
ory needs most at this time are specific and 
clearly falsifiable predictions that would set 
it apart from other theories. 


Improving Intelligence 


Although designers of artificial intelligence 
have made great strides in creating programs 
that simulate knowledge and skill acquisi- 
tion, no existing program even approaches 
the ability of the human brain to enhance 
its own intelligence. Human intelligence is 
highly malleable and can be shaped and 
even increased through various kinds of in- 
terventions (Detterman & Sternberg, 1982; 
Grotzer & Perkins, 2000; Perkins & Grotzer, 
1997; Sternberg et al., 1996; Sternberg et al., 
1997; see Ritchhart & Perkins, Chap. 32, for 
a review of work on teaching thinking skills). 
Moreover, the malleability of intelligence 
has nothing to do with the extent to which 
intelligence has a genetic basis (Sternberg, 
1997). An attribute (such as height) can be 
partly or even largely genetically based and 
yet be environmentally malleable. 

The Head Start program was initiated 
in the 1960s to provide preschoolers with 
an edge on intellectual abilities and accom- 
plishments when they started school. Long- 
term follow-ups have indicated that by mid- 
adolescence, children who participated in 
the program were more than a grade ahead 
of matched controls who did not receive the 
program (Lazar & Darlington, 1982; Zigler 
& Berman, 1983). The children in the pro- 
gram also scored higher on a variety of tests 
of scholastic achievement, were less likely to 
need remedial attention, and were less likely 
to show behavioral problems. Although such 
measures are not truly measures of intelli- 
gence, they show strong positive correlations 
with intelligence tests. 

An alternative to intellectual enrichment 
outside the home may be to provide an enriched
home environment. A particularly
successful project has been the Abecedarian
Project, which showed that the cognitive
skills and achievements of lower socioeconomic
status children could be increased
through carefully planned and executed in- 
terventions (Ramey & Ramey, 2000). 

Bradley and Caldwell (1984) found sup- 
port for the importance of home environ- 
ment with regard to the development of 
intelligence in young children. These re- 
searchers found that several factors in the 
early (preschool) home environment were 
correlated with high IQ scores — emotional 
and verbal responsivity of the primary care- 
giver and the caregiver’s involvement with 
the child, avoidance of restriction and pun- 
ishment, organization of the physical envi- 
ronment and activity schedule, provision of 
appropriate play materials, and opportuni- 
ties for variety in daily stimulation. Further, 
Bradley and Caldwell found that these fac- 
tors more effectively predicted IQ scores 
than did socioeconomic status or family- 
structure variables. It should be noted, how-
ever, that the Bradley-Caldwell study is
correlational and therefore cannot be inter- 
preted as indicating causality. Furthermore, 
their study pertained to preschool children, 
and children’s IQ scores do not begin to pre- 
dict adult IQ scores well until age four years. 
Moreover, before age seven years, the scores 
are not very stable (Bloom, 1964). More re- 
cent work (e.g., Pianta & Egeland, 1994) sug- 
gested that factors such as maternal social 
support and interactive behavior may play a 
key role in the instability of scores on tests 
of intellectual ability between ages two and 
eight years. 

The Bradley and Caldwell data should not 
be taken to indicate that demographic vari- 
ables have little effect on IQ scores. To the 
contrary, throughout history and across cul- 
tures, many groups of people have been as- 
signed pariah status as inferior members of 
the social order. Across cultures, these dis- 
advantaged groups (e.g., native Maoris vs. 
European New Zealanders) have shown dif- 
ferences in tests of intelligence and apti- 
tude (Steele, 1990; Zeidner, 1990). Such was 
the case of the Burakumin tanners in Japan, 
but not full acceptance into Japanese society.
Despite their poor performance and under-
privileged status in Japan, those who immi-
grate to America, and are treated like other
Japanese immigrants, perform on IQ tests
and in school achievement at a level compa- 
rable to that of their fellow Japanese Amer- 
icans (Ogbu, 1986). 

Similar positive effects of integration 
were shown on the other side of the world. 
In Israel, the children of European Jews score 
much higher on IQ tests than do children of 
Arabic Jews — except when the children are 
reared on kibbutzim in which the children 
of all national ancestries are raised by spe- 
cially trained caregivers in a dwelling sepa- 
rate from their parents. When these children 
shared the same child-rearing environments, 
there were no national-ancestry-related dif- 
ferences in IQ. 

Altogether, there is now abundant ev-
idence that people’s environments (e.g.,
Ceci, Nightingale, & Baker, 1992; Reed, 
1993; Sternberg & Wagner, 1994; Wagner, 
2000), their motivation (e.g., Collier, 1994; 
Sternberg & Ruzgis, 1994), and their train- 
ing (e.g., Feuerstein, 1980; Sternberg, 1987) 
can profoundly affect their intellectual skills. 
Thus, the controversial claims made by Her- 
rnstein and Murray (1994) in their book, The 
Bell Curve, regarding the futility of interven- 
tion programs, are unfounded when one con- 
siders the evidence in favor of the possibility 
of improving cognitive skills. Likewise, Her- 
rnstein and Murray’s appeal to “a genetic fac- 
tor in cognitive ethnic differences” (Herrn- 
stein & Murray, 1994, p. 270) falls apart in 
light of the direct evidence against such ge- 
netic differences (Sternberg, 1996) and re- 
sults from a misunderstanding of the heri- 
tability of traits in general. 

Heredity certainly plays a role in indi- 
vidual differences in intelligence (Loehlin, 
2000; Loehlin, Horn, & Willerman, 1997; 
Plomin, 1997), as does the environment 
(Grigorenko, 2000, 2002; Sternberg & Grig- 
orenko, 1999; Wahlsten & Gottlieb, 1997). 
Genetic inheritance may set some kind of 
upper limit on how intelligent a person may 
become. However, we now know that for 


reaction range — that is, the attribute can be 
expressed in various ways within broad lim- 
its of possibilities. Thus, each person’s intel- 
ligence can be developed further within this 
broad range of potential intelligence (Grig- 
orenko, 2000). We have no reason to believe 
that people now reach their upper limits in 
the development of their intellectual skills. 
To the contrary, the evidence suggests that 
we can do quite a bit to help people become 
more intelligent (for further discussion of 
these issues, see R. Mayer, 2000, and Neisser 
et al., 1996). 

Environmental as well as hereditary fac- 
tors may contribute to retardation in intelli- 
gence (Grigorenko, 2000; Sternberg & Grig- 
orenko, 1997). Environmental influences be- 
fore birth may cause permanent retardation, 
which may result from a mother’s inade- 
quate nutrition or ingestion of toxins such 
as alcohol during the infant’s prenatal devel- 
opment (Grantham-McGregor, Ani, & Fer- 
nald, 2002; Mayes & Fahy, 2001; Olson, 
1994), for example. Among the other en- 
vironmental factors that can negatively im- 
pact intelligence are low social and eco- 
nomic status (Ogbu & Stern, 2001; Seifer, 
2001), high levels of pollutants (Bellinger & 
Adams, 2001), inadequate care in the fam- 
ily or divorce (Fiese, 2001; Guidubaldi & 
Duckworth, 2001), infectious diseases (Al- 
cock & Bundy, 2001), high levels of radiation 
(Grigorenko, 2001), and inadequate school-
ing (Christian, Bachman, & Morrison, 2001).
Physical trauma can injure the brain, causing 
mental retardation. 


Conclusions and Future Directions 


In conclusion, many approaches have been 
taken to improve understanding of the na- 
ture of intelligence. Great progress has been 
made in elaborating the construct but much 
less progress in converging upon either a 
definition or a universally accepted theory. 
Much of current debate revolves around 
trying to figure out what the construct is 
and how it relates to other constructs, such 
ligence can be measured, to some extent,
and it can be improved. Improvements are 
not likely to eliminate individual differences, 
however, because attempts to improve intel- 
ligence can help people at all levels and with 
diverse kinds of intelligence. No matter how 
high one’s intelligence, there is always room 
for improvement; and no matter how low, 
there are always measures that can be taken 
to help raise it. 
CHAPTER 32


Learning to Think: The Challenges 
of Teaching Thinking 


Ron Ritchhart 
David N. Perkins 


The idea that thinking can be taught, or at 
least productively nurtured along its way, is 
ancient. Beginning with the efforts of Plato 
and the introduction of Socratic dialog, we 
see attention to improving intelligence and 
promoting effective thinking as a recurring 
educational trend throughout the ages. Early 
in the twentieth century, Dewey (1933) 
again focused North Americans’ attention
on the importance of thinking as an educa- 
tional aim. At the same time, Selz (1935) 
was advocating the idea of learnable intel- 
ligence in Europe. In the 1970s and 1980s, 
specific programs designed to teach think- 
ing took shape, many of which continue in 
schools today. Efforts to teach thinking have 
proliferated in the new millennium, often 
becoming less programmatic in nature and 
more integrated within the fabric of schools. 

Despite this long history of concern with 
thinking, one reasonably might ask: Why do 
we need to “teach” thinking anyway? After 
all, given reasonable access to a rich cultural 
surround, individuals readily engage in sit- 
uated problem solving, observing, classify- 
ing, organizing, informal theory building and 
testing, and so on, without much prompt-
ing or even support. Indeed, neurological
findings suggest that the brain is hard-wired 
for just such activities as a basic mechanism 
for facilitating language development, so- 
cialization, and general environmental sur- 
vival. Furthermore, it might be assumed 
that these basic thinking skills are already 
enhanced through the regular processes of 
schooling, as students encounter the work 
of past thinkers, engage in some debate, 
write essays, and so on. Why, then, should 
we concern ourselves with the teaching and 
learning of thinking? Addressing these is- 
sues entails looking more closely at a fuller 
range of thinking, particularly what might 
be called high-end thinking, as well as ex- 
amining the role education plays in promot- 
ing thinking. 

Although it is true that the human mind 
comes readily equipped for a wide variety of 
thinking tasks, it is equally true that some 
kinds of thinking run against these natural 
tendencies. For example, probabilistic think- 
ing is often counterintuitive in nature or 
doesn’t fit well with our experience (Tversky 
& Kahneman, 1993; also see Kahneman &
Frederick, Chap. 12). We have a natural 
tion and interests — my-side bias (Molden &
Higgins, Chap. 13) — that can lead to poor 
conclusions in decision making and discern-
ments of truth (Baron et al., 1993). We
frequently draw conclusions and inferences 
based on limited evidence (Perkins, 1989, 
1995). The fundamental attribution error 
(Harvey, Town, & Yarkin, 1981) names the 
tendency, particularly in Westerners, to as- 
cribe characterological traits to others based 
on limited but highly salient encounters. 

Furthermore, sometimes our natural ways 
of making sense of the world actually stand 
in the way of more effective ways of think- 
ing. For instance, our ability to focus at- 
tention can lead to narrowness of vision 
and insight. Our natural tendency to detect 
familiar patterns and classify the world can 
lock us into rigid patterns of action and 
trap us in the categories we invent (Langer, 
1989). Relatedly, already developed under- 
standings constitute systems of knowledge 
that are much more readily extended than 
displaced: We tend to dismiss or recast chal-
lenges rather than rethinking our under-
standings, which is a deep and general
problem of learning (see Chi & Ohlsson,
Chap. 16). Our emotional responses to 
situations can easily override more de- 
liberative thinking (Goleman, 1995). The 
phenomenon of groupthink, in which the 
dominant views of the group are readily 
adopted by group members, can lead to lim-
ited processing and discernment of infor-
mation (Janis, 1972). These are just a few
thinking shortfalls suggesting that truly good 
thinking does not automatically develop in 
the natural course of events. 

Even when our native tendencies do not 
lead us astray, they can usually benefit from 
development. The curiosity of the child for 
discovering and making sense of the world 
does not automatically evolve into an intel- 
lectual curiosity for ideas, knowledge, and 
problem solving (Dewey, 1933), for exam- 
ple. Our ability to see patterns and rela- 
tionships forms the basis for inductive rea- 
soning (see Sloman & Lagnado, Chap. 5), 
but the latter requires a level of precision 
and articulation that must be learned. Our 


much more sophisticated through system- 
atized processes of reasoning with evidence, 
weighing evidentiary sources, and drawing 
justifiable conclusions. Indeed, for most 
thinking abilities that might be considered 
naturally occurring, one can usually identify 
a more sophisticated form that such think- 
ing might take with some deliberate nurtur- 
ing. This type of thinking is what is often 
referred to as high-end thinking or criti- 
cal and creative thinking. Such thinking ex- 
tends beyond a natural processing of the 
world into the realm of deliberative thinking 
acts aimed at solving problems, making de- 
cisions (see LeBoeuf & Shafir, Chap. 11), and 
forming conclusions. 

The contribution of schooling to the de- 
velopment of thinking is a vexed matter (see 
Greenfield, Chap. 27, for a cross-cultural 
perspective on the impact of schooling). On 
the one hand, it is clear that schooling en- 
hances performance of various kinds on for- 
mal tasks and IQ-like instruments (Grotzer 
& Perkins, 2000; Perkins, 1985; see Stern- 
berg, Chap. 31, for a discussion of intelli- 
gence). For the most part, however, schools 
have addressed knowledge and skill acqui- 
sition. The narrowness of this focus and 
absence of strong efforts to nurture think- 
ing were criticized by Dewey at the turn 
of the century. Such critiques have contin- 
ued until today from a variety of sources. In 
a series of empirical investigations, Perkins 
and colleagues (Perkins, Allen, & Hafner, 
1983; Perkins, Farady, & Bushey, 1991) in-
vestigated the impact of conventional ed-
ucation at the high school, university, and
graduate school levels on informal reasoning 
about everyday issues. Cross-sectional stud- 
ies examining the impact of three years of 
high school, college, and graduate school re- 
vealed only marginal gains (Perkins, 1985). 
Several national reports on schooling in the 
1980s discussed how schools were domi- 
nated by rote work and involved very lit- 
tle thinking (Boyer, 1983; National Com- 
mission on Excellence in Education, 1983; 
Goodlad, 1983). 

The problems of overcoming thinking 
shortfalls while enhancing native thinking 
stitute an important rationale for the ex-
plicit teaching of thinking. Furthermore, as 
knowledge and information become at the 
same time more complex and more acces- 
sible, critics argue that teaching thinking 
should be considered even more of a pri- 
ority (Resnick, 1987). In this setting, it is 
not enough to simply consume predigested 
knowledge; one must also become a knowl-
edge builder (Scardamalia, Bereiter, & La-
mon, 1994) and problem solver (Polya, 1957;
Schoenfeld, 1982; Selz, 1935). 

This need for thinking instruction has led 
to a rapid increase in efforts to teach thinking 
over the past thirty years. During this time, a 
few well-established thinking programs have 
taken hold in schools and sustained their de- 
velopment, while a plethora of new pro- 
grams, often small interventions based on 
current cognitive theory, have flourished. 
In addition, an increasing array of subject- 
based programs and designed learning en- 
vironments aimed at developing students’ 
thinking also have emerged. These programs 
deal with many different aspects of think- 
ing, including critical and creative thinking 
(for more on creative thinking, see Sternberg 
et al., Chap. 15), reflective and metacognitive
thinking, self-regulation, decision-making, 
and problem solving, as well as disciplinary 
forms of thinking. 

All of these programs — whether aimed 
at developing thinking as part of a stand-
alone course, within the context of teach-
ing a particular subject, or as part of a larger
design of the instructional environment — 
confront at least five important challenges 
in their efforts to develop thinking. We use 
these as the basis for the present review. 
The first challenge relates to the bottom 
line: Can thinking be taught with some rea- 
sonable signs of success? The second chal- 
lenge concerns what is meant when one talks 
about good thinking. Programs and efforts to 
teach thinking are shaped largely by the an- 
swer to this question. The third challenge 
deals with the dispositional side of think-
ing, not just skills and processes but atti-
tudes and intellectual character (Ritchhart,
2002; Tishman, 1994). The fourth challenge


the teaching of thinking. We conclude with 
a fifth challenge, that of creating cultures of 
thinking, in which we examine the social 
context and environment in which think- 
ing is being promoted. Each of these chal- 
lenges involves key philosophical and prac- 
tical issues that all efforts to teach thinking, 
whether undertaken by a single teacher or 
a major research university, must confront. 
We review the ways in which various efforts 
to teach thinking address these challenges 
to clarify just what is involved in teaching 
thinking. 


The Challenge of Attaining Results 


As is the case with any class of educational 
interventions, one of the most fundamental 
questions to be asked is: Do they work,
at least with some populations under some 
circumstances? This is especially important 
for an area like the teaching of thinking, 
which is haunted by skepticism on the part 
of lay people and some scholars. 

It may seem premature to turn to findings 
without discussing details about background 
theories and issues in the field, but letting 
the question of impact hover for many pages 
while we deal with such matters also seems 
troublesome. After all, if there isn’t at least 
some indication that thinking can be taught, 
then the remaining challenges become aca- 
demic. Accordingly, we turn to this ultimate 
challenge first, asking whether, at least some- 
times, coordinated efforts to teach thinking 
work in a reasonable sense, also taking it as 
an opportunity to put quick profiles of sev- 
eral interventions on the table to give readers 
a feel for the range of approaches. 

In looking for success, it is helpful to bear 
in mind three broad criteria — magnitude, 
persistence, and transfer (Grotzer & Perkins, 
2000). An intervention appears successful 
to the extent that it shows some magni- 
tude of impact on learners’ thinking with 
effects that persist well beyond the period 
of instruction and with transfer to other 
contexts and occasions. Previous reviewers 
of thinking programs pointed out that the 
effectiveness is often hard to come by in
the research literature (e.g., Adams, 1989; 
Nickerson, Perkins, & Smith, 1985; Stern- 
berg, 1986), often because of the lack of 
funding for careful long-term program eval- 
uation. We emphatically do not limit this ar- 
ticle only to those programs receiving exten- 
sive evaluation, but we do focus this section 
on a few such programs. The good news is 
that the history of efforts to teach thinking 
provides proofs for achieving all three crite- 
ria, at least to some extent. 

Programs designed to teach thinking 
come in many different styles. For instance, 
some programs are designed to develop dis- 
crete skills and processes such as classifica- 
tion and sequencing as means of developing 
the building blocks for thinking. Paul (1984) 
refers to these programs as “micrological” in 
nature. They often find their theoretical jus- 
tification in theories of intelligence (see next 
section for more on how various programs 
define good thinking), and they often use de- 
contextualized and abstract materials similar 
to those one might find on standardized psy- 
chometric tests. 

Perhaps the best-known program of this 
type is Instrumental Enrichment (IE) (Feuer- 
stein, 1980). It uses very abstract, test-like 
activities to develop skills in areas such 
as comparisons, categorization, syllogisms, 
and numerical progressions, among others. 
Instructors are encouraged to “bridge” the 
abstract exercises by relating the skills to real-
world problem solving. Instrumental en-
richment was designed to bring students
who show marked ability deficits into main- 
stream culture, although it can be used with 
other students as well. 

In one study, matched samples of 
low-functioning, low socioeconomic status
(SES) twelve- to fifteen-year-olds partici- 
pated in IE or general enrichment (GE) pro- 
grams providing direct help, such as math 
or science tutoring. Instrumental enrich- 
ment subjects made greater pre- to post- 
test gains on tests of interpersonal con- 
duct, self-sufficiency, and adaptation to work 
demands. Instrumental enrichment subjects 
scored slightly above normal, far better than 


better than GE subjects by about a third of 
a standard deviation on incidental follow- 
up testing on an Army Intelligence test 
(DAPAR) two years later (Feuerstein et al., 
1981; Rand, Tannenbaum, & Feuerstein, 
1979). These findings show both magnitude 
and persistence of effects, with some trans- 
fer. The program uses testlike activities, so 
the transfer to a nonverbal intelligence test 
might be considered a case of near trans- 
fer (Perkins & Salomon, 1988). Evidence of 
transfer to school tasks — far transfer — seems 
to depend on the individual teacher or in- 
structor, who is responsible for providing the 
bridging (Savell, Twohig, & Rachford, 1986; 
Sternberg, 1986). 

These findings have proved less easily 
replicated with students of average or above- 
average ability. What is consistent, however, 
is the change in behavior and attitude stu- 
dents experience, generally in terms of in- 
creased confidence in abilities and a more 
positive attitude toward school work (Blagg, 
1991; Kriegler, 1993). 

Another type of program to teach think- 
ing tends to be more “macrological” in na- 
ture (Paul, 1984), being contextualized and 
real-world oriented, focusing on more broad-
based skills such as considering multiple
points of view, dealing with complex in-
formation, or creative problem solving. Phi-
losophy for Children (Lipman, 1976) and
CoRT (Cognitive Research Trust) (de Bono,
1973) are examples of this approach. The
Philosophy for Children program engages 
students in philosophical discussions around 
a shared book to cultivate students’ abil- 
ity to draw inferences, make analogies, form 
hypotheses, and so forth. The CoRT pro- 
gram teaches a collection of thinking “op- 
erations,” defined by acronyms for creative 
and critical thinking; these operations aim
to broaden and organize thinking and fa- 
cilitate dealing with information. Through 
a developed set of practice problems, for 
instance, students learn to apply the PMI 
operation (plus, minus, interesting), iden- 
tifying the pluses, minuses, and interesting 
but otherwise neutral points about a matter
at hand.
long enough to develop a strong base and
avid followers, resulting in a wealth of anec- 
dotal evidence and reports of effectiveness. 
Indeed, observers of these programs tend to 
be impressed with the involvement of stu- 
dents and the level of thinking demonstrated 
(Adams, 1989). Furthermore, some evidence 
can be found to support both programs. Ed- 
wards (1994) reports that twelve-year-olds 
taught all sixty lessons of the CoRT pro- 
gram showed improved scores on quantita- 
tive as well as qualitative measures. Com- 
pared with other seventh grade students, 
scores of CoRT students ranged from 48% to 
62% above the national mean on standard- 
ized tests, whereas other seventh graders’ 
scores ranged from 25% to 43% above the 
national norm of 31%, indicating a mag- 
nitude effect. Teachers reported improve- 
ments in student thinking and confidence. 
Although students reported using the skills 
in other areas of their lives, there was no for- 
mal measure of transfer on this evaluation. 
Other evaluations revealed mixed results on 
transfer (Edwards & Baldauf, 1983, 1987). 
The program produces an interesting find- 
ing with respect to persistence that should 
be noted. Although reviews of research on 
CoRT suggest that the effects were short- 
term (Edwards, 1991a, 1991b), it was found 
that a small amount of follow-up reinforce- 
ment given in the two years after the inter- 
vention resulted in increased persistence of 
effects with scores that were one-third better 
than controls three years after the interven- 
tion (Edwards, 1994). 

With respect to Philosophy for Children, 
evaluations have shown that children in 
grades four to eight display significant gains 
in reading comprehension or logical think- 
ing (Lipman, 1983). Transfer is built into the 
program because the discussions are text- 
based and consequently deepen comprehen- 
sion while teaching and modeling thinking 
strategies within the real world contexts of 
the stories. As Adams (1989, p. 37) points 
out, the texts give “Lipman the freedom to 
introduce, reintroduce, and elaborate each 
logical process across a diversity of real- 
world situations.” 


unique hybrid. The Odyssey program (Adams,
1986), developed through a collaboration
between Harvard Project Zero, Bolt Beranek
and Newman, Inc., and the Venezuela Min-
istry of Education, was specifically designed
to systematically build macrological skills 
upon micrological skills. The first lessons of 
the program deal with micrological skills, or 
what the program developers call first-order 
processes of classification, hierarchical clas- 
sification, sequencing, and analogical reason- 
ing, to build the foundation for the macro- 
logical process of dimensional analysis. 
Processes often are introduced in the ab- 
stract, but then application is made to varied 
contexts. The program takes the form of a 
separate course with 100 lessons, but it seeks 
to connect directly to the scholastic activi- 
ties of students and provide links to everyday 
life as well. The Odyssey program has been 
evaluated only in Venezuela. In a relatively 
large evaluation of the program involving 
roughly 900 students in control and exper-
imental groups across twenty-two seventh
grade classes, the group gains of the exper-
imental group were 117 percent greater than
those of the control group on course-designed
pre- and postmeasurements, a strong indi-
cator of magnitude of effects. A battery of
tests was used to assess transfer, includ-
ing those of general ability, word problems,
and nonverbal reasoning. All showed signif- 
icant gains for the experimental group, indi- 
cating both magnitude and transfer of effects 
(Herrnstein et al., 1986).

The abovementioned programs, whether 
focusing on micrological or macrological 
skills, were stand-alone interventions with 
perhaps a modest degree of integration. A 
number of programs are fully integrated and 
connected to the curriculum. A few of these 
are Intuitive Math (Burke, 1971) and Problem 
Solving and Comprehension (Whimbey &
Lochhead, 1979), both focused on mathe- 
matics, and Think (Adams, 1971) and Re- 
ciprocal Teaching (Brown & Palincsar, 1982), 
which are focused on language arts and read- 
ing. All of these programs are designed to 
connect thinking processes to specific school 
content to enhance student understanding 
cus on skills such as classification, structure
analysis, and seeing analogies. Problem Solv- 
ing and Comprehension uses a technique 
called “paired problem solving” to develop 
metacognitive awareness of one’s thinking 
during problem solving. Reciprocal Teaching 
is not so much a program as an approach to 
teaching reading comprehension. Through a 
dialog with the teacher, students engage in 
cycles of summarizing, question generating, 
clarifying, and predicting. All of these inter- 
ventions have been shown to produce im- 
pressive results for their target populations, 
generally low-achieving students, within the 
domains of their focus. In addition, transfer 
effects have been documented for Intuitive 
Math and Think (Worsham & Austin, 1983; 
Zenke & Alexander, 1984). 

As promised, these examples — and oth- 
ers discussed later — offer a kind of exis- 
tence proof regarding the challenge of attain- 
ing results (more reviews of these and other 
thinking programs can be found in Adams, 
1989; Grotzer & Perkins, 2000; Hamers & 
Overtoom, 1997; Idol, 1991; McGuinness & 
Nisbet, 1991; Nickerson et al., 1985; Perkins, 
1995; Sternberg, 1986). They give evidence 
that instruction designed to improve learn- 
ers’ thinking can advance it, with persistent 
impact, and with some degree of transfer to 
other contexts and occasions. Along the way, 
they also illustrate how rather different ap- 
proaches can serve this purpose. 

This is not to say that such results demon- 
strate overwhelming success. Impacts on 
learners’ thinking are typically moderate 
rather than huge. The persistence of effects 
tapers off after a period of months or years, 
particularly when learners return to settings 
that do not support the kind of development 
in question. Transfer effects are often spotty 
rather than sweeping. These limitations are 
signs that the grandest ambitions regarding 
the teaching of thinking are yet to be real- 
ized. That said, enough evidence is at hand 
to show that the prospects of teaching think- 
ing cannot simply be dismissed on theoreti- 
cal or empirical grounds. This opens the way 
for a deeper consideration of the challenges 
of doing so in the upcoming sections. 


Good Thinking 


Any program that aspires to teach think- 
ing needs to face the challenge of defining 
good thinking, not necessarily in any ulti- 
mate and comprehensive sense but at least in 
some practical, operational sense. With the 
foregoing examples of programs in mind, it 
will come as no surprise that many differ- 
ent approaches have been taken to answer 
this challenge. 

To begin, it is useful to examine some gen- 
eral notions about the nature of good think- 
ing. There are a number of very broad char- 
acterizations. Folk notions of intelligence, in 
contrast with technical notions, boil down 
to good thinking. A number of years ago, 
Sternberg et al. (1981) reported research 
synthesizing the characteristics people en- 
vision when they think of someone as in- 
telligent. Intelligent individuals reason sys- 
tematically, solve problems well, think in a 
logical way, deploy a good vocabulary, make 
use of a rich stock of information, remain fo- 
cused on their goals, and display intelligence 
in practical as well as academic ways. Perkins 
(1995) summed up a range of research on 
difficulties of thinking by noting the human 
tendency to think in ways that are hasty (im- 
pulsive, insufficient investment in deep pro- 
cessing and examining alternatives), narrow 
(failure to challenge assumptions, examine 
other points of view), fuzzy (careless, im- 
precise, full of conflations), and sprawling 
(general disorganization, failure to advance 
or conclude). Baron (1985) advanced a
search-and-inference framework that em- 
phasized effective search and inference 
around forming beliefs, making decisions, 
and choosing goals. Ennis (1986) offered a 
list of critical thinking abilities and disposi- 
tions, including traits such as seeking and of- 
fering reasons, seeking alternatives, and be- 
ing open-minded. There are many others 
as well. 

The overlap among such conceptions is 
apparent. They can be very useful for a broad 
overview and for the top level of program 
design, but they are not virtues of thinking 
that learners can straightforwardly learn or 
good theory of action (e.g., Argyris, 1993;
Argyris & Schön, 1996) that would guide
and advise learners about how to improve 
their thinking, or guide and advise teachers 
and program designers about how to culti- 
vate thinking. With this general challenge 
in mind, we turn to describing three ap- 
proaches through which researchers and ed- 
ucators have constructed theories of action 
that characterize good thinking — by way of 
norms and heuristics, models of intelligence, 
and models of human development. 


Norms and Heuristics 


One common approach to defining good 
thinking is to characterize concepts, stan- 
dards, and cognitive strategies that serve 
a particular kind of thinking well. These 
guide performance as norms and heuristics. 
When people know the norms and heuris- 
tics, they can strive to improve their practice 
accordingly. The result is a kind of “craft” 
conception: Good thinking is a matter of 
mastering knowledge, skills, and habits ap- 
propriate to the kind of thinking in question 
as guided by the norms and heuristics. 

Norms provide criteria of adequacy for 
products of thinking such as arguments or 
grounded decisions. Examples of norms in- 
clude suitable conditions for formal deduc- 
tion or statistical adequacy, formal (e.g., af- 
firming the consequent) or informal (e.g., ad 
hominem argument) fallacies to be avoided, 
or maximized payoffs in game theory (Ham- 
blin, 1970; Nisbett, 1993; Voss, Perkins, & 
Segal, 1991). Heuristics guide the process of 
thinking, but without the guarantees of suc- 
cess that an algorithm provides. For instance, 
mathematical problem solvers often do well 
to examine specific cases before attempting 
a general proof or to solve a simpler related 
problem before tackling the principal prob- 
lem (Polya, 1954, 1957). 

The norms and heuristics approach fig- 
ures widely in educational endeavors. Train- 
ing in norms of argument goes back at least 
to the Greek rhetoricians (Hamblin, 1970) 
and continues in numerous settings of for- 
mal education today with many available 


and taught for many generic thinking prac- 
tices — everyday decision making, problem 
solving, evaluating of claims, creative think- 
ing, and so on. 

Looking to programs mentioned earlier 
for examples, we note that the CoRT pro- 
gram teaches “operations” such as PMI (con- 
sider plus, minus, and interesting factors in a 
situation) and OPV (consider other points 
of view) (de Bono, 1973). The Odyssey 
program teaches strategies for decision- 
making, problem solving, and creative de- 
sign, among others, foregrounding familiar 
strategies such as looking for options be- 
yond the obvious, trial and error, and ar- 
ticulation of purposes (Adams, 1986). Polya 
(1954, 1957) offered a well-known analysis 
of strategies for mathematical problem solv- 
ing, including examining special cases, ad- 
dressing a simplified form of the problem 
first, and many others. This led to a num- 
ber of efforts to teach mathematical prob- 
lem solving, with unimpressive results, until 
Schoenfeld (1982; Schoenfeld & Herrmann, 
1982) demonstrated a very effective inter- 
vention that included the instructor’s work- 
ing problems while commenting on strate- 
gies as they were deployed, plus emphasis 
on the students’ self-management of the 
problem-solving process. Many simple read- 
ing strategies have been shown to improve 
student retention and understanding when 
systematically applied, including, for exam- 
ple, the previously mentioned “reciprocal 
teaching” framework in which young read- 
ers interact conversationally in small groups 
around a text to question, clarify, summa- 
rize, and predict (Brown & Palincsar, 1982). 

Nisbett (1993) reported a series of stud- 
ies conducted by himself and colleagues 
about the effectiveness of teaching norms 
and heuristics of statistical, if-then, cost- 
benefit, and other sorts of reasoning, mainly 
to college students. Nisbett concluded that 
instruction in rules of reasoning was consid- 
erably more effective than critics of general, 
context-free rules for reasoning had claimed. 
To be sure, student performance displayed 
a range of lapses and could have been bet- 
ter. Nonetheless, students often applied the 
patterns of reasoning quite widely, well 
beyond the content 
foregrounded in the instruction. Relatively 
abstract and concise formulations of princi- 
ple alone led to some practical use of rules 
for reasoning, and this improved when in- 
struction included rich exploration of exam- 
ples. Nisbett emphasized that we could cer- 
tainly teach rules for reasoning much better 
than we do. Nonetheless, the basic enterprise 
appeared to be sound. 

To summarize, the characteristic peda- 
gogy of the approach through norms and 
heuristics follows from its emphasis on 
thinkers’ theories of action. Programs of this 
sort typically introduce norms and heuris- 
tics directly, demonstrate their application, 
and engage learners in practice with a range 
of problems, often with an emphasis on 
metacognitive awareness, self-management, 
and reflection on the strategies, general char- 
acter, and challenges of thinking. 

Readily grasped concepts and standards, 
strategies with three or four steps, and the 
like characterize the majority of norms and 
heuristics approaches. One objection to such 
simplicity is that it can seem simpleminded. 
“Everyone knows” that people should con- 
sider both sides of the case in reasoning or 
look for options beyond the obvious. How- 
ever, as emphasized in the introduction to 
this article, such lapses are commonplace. 
Everyone does not know, and those who 
do know often fail to do so. The point of 
norms and heuristics most often is not to re- 
veal novel or startling secrets of a particular 
kind of thinking but to articulate some ba- 
sics and help bridge from inert knowledge to 
active practice. 


Models of Intelligence 


The norms and heuristics approach to defin- 
ing and cultivating good thinking may be the 
most common, but another avenue looks di- 
rectly to models of intelligence (see Stern- 
berg, Chap. 31). Not so often encoun- 
tered in the teaching of thinking is good 
thinking defined through classic intelligence 
quotient (IQ) theory. On the one hand, 
many, although by no means all, scholars 
consider general intelligence in the sense 
of Spearman’s g factor to be unmodifiable 
by direct instructional interventions (Brody, 
1992; Jensen, 1980, 1998). On the other 
hand, a single factor does not afford much of 
a theory of action, because it does not break 
down the learning problem into components 
that can be addressed systematically. 
Models of intelligence with components 
offer more toward a theory of action. 
J. P. Guilford’s (1967; Guilford & Hoepfner, 
1971) Structure of Intellect (SOI) model, 
for example, proposes that intelligence in- 
volves no fewer than 150 different com- 
ponents generated by a three-dimensional 
analysis involving several cognitive opera- 
tions (cognition, memory, evaluation, con- 
vergent production, divergent production) 
crossed with several kinds of content (be- 
havioral, visual figural, and more) and cog- 
nitive products (units, classes, relations, 
and more). An intervention developed by 
Meeker (1969) aims to enhance the func- 
tioning of a key subset of these compo- 
nents. Feuerstein (1980) argues that in- 
telligence is modifiable through mediated 
learning (with a mediator scaffolding learn- 
ers on the right kinds of tasks). His In- 
strumental Enrichment program offers a 
broad range of mediated activities orga- 
nized around three broad categories of cog- 
nitive process — information input, elabora- 
tion, and output — to work against problems 
such as blurred and sweeping perception, 
impulsiveness, unsystematic hypothesis test- 
ing, and egocentric communication. 
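
As an aside on the arithmetic behind the Structure of Intellect count mentioned above, the figure of 150 follows from crossing the three dimensions. The minimal Python sketch below assumes five operations (those listed in the text), five kinds of content, and six kinds of product. The content and product labels beyond those named in the text are assumptions drawn from common summaries of Guilford’s later model, not from this chapter.

```python
from itertools import product

# Operations as listed in the text.
operations = [
    "cognition",
    "memory",
    "evaluation",
    "convergent production",
    "divergent production",
]

# Assumed labels: only "behavioral" and "visual figural" are named in the text.
contents = ["visual figural", "auditory figural", "symbolic", "semantic", "behavioral"]

# Assumed labels: only "units", "classes", and "relations" are named in the text.
products = ["units", "classes", "relations", "systems", "transformations", "implications"]

# Each component is one (operation, content, product) cell of the model.
components = list(product(operations, contents, products))

print(len(components))
```

Under these assumed lists, the cross product contains 5 x 5 x 6 = 150 cells, which is the sense in which the analysis is three-dimensional.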
Sternberg (1985) developed the triarchic 
theory of intelligence over a number of years, 
featuring three dimensions of intelligence — 
analytic (as in typical IQ tests), practical 
(expert “streetwise” behavior in particular 
domains), and creative (invention, innova- 
tion). Sternberg, et al. (1996) report an in- 
tervention based on Sternberg’s (1985) tri- 
archic theory of intelligence: High school 
students taking an intensive summer col- 
lege course were grouped by their strengths 
according to Sternberg’s three dimensions 
and taught the same content in ways build- 
ing on their strengths. The study included 
other groups not matched with their 
rior performance. 

The typical pedagogy of interventions 
based on models of intelligence empha- 
sizes not teaching norms and heuristics but 
rather providing abundant experience with 
the thinking processes in question in moti- 
vated contexts with strong emphasis on at- 
tention and self-regulation. Often, although 
by no means always — the Sternberg inter- 
vention is an exception here, for example — 
the tasks have a rather abstract character 
on the theory that the learning activities 
are enhancing the functioning of fundamen- 
tal cognitive operations and content is best 
selected for minimal dependence on back- 
ground knowledge. That said, it is impor- 
tant to recognize that no matter what the 
underlying theory — norms and heuristics, 
intelligence-based, or developmental, as in 
the following section — interventions often 
pragmatically combine a variety of methods 
rather than proceeding in a purist manner. 


Models of Human Development 


Another approach to defining good think- 
ing looks to models of human development 
that outline how cognitive development nor- 
mally advances, often through some se- 
quence of stages that represent plateaus 
in the complexity of cognition, as with 
the classic concrete and formal operational 
stages of Inhelder and Piaget (1958; see Hal- 
ford, Chap. 22). For example, the program 
called Cognitive Acceleration through Sci- 
ence Education (CASE) (Adey & Shayer, 
1993, 1994) teaches patterns of thinking in 
science — for instance the isolation and con- 
trol of variables — based on Piagetian princi- 
ples of uncovering students’ prior concep- 
tions and creating opportunities for them 
to reorganize their thinking. Lessons intro- 
duce cognitive dissonance around particular 
puzzles so students are led to examine their 
assumptions and rethink their prior con- 
ceptions. In addition to the thinking skills, 
the program focuses explicitly on fostering 
metacognition and transferring knowledge 
and strategies between contexts. A formal 


control students on school science achieve- 
ment tests with delayed posttesting. For 
some groups, substantial and statistically 
significant differences emerged for science, 
mathematics, and English performance two 
years after participation in CASE, demon- 
strating magnitude, persistence, and transfer 
of impact, the criteria used in the foregoing 
results section (Adey & Shayer, 1994, p. 92). 

Although this example takes a stage-like 
view of human development, another tra- 
dition looks to the work of Vygotsky and 
his followers, seeing development more as 
a process of internalization from social situ- 
ations that scaffold for the thinking of the 
participant (Vygotsky, 1978). In addition to its 
Piagetian emphasis, the work of Adey and 
Shayer draws upon social scaffolding. Scar- 
damalia and colleagues developed an initia- 
tive initially called CSILE (Computer Sup- 
ported Intentional Learning Environments) 
and now Knowledge Forum, that engages 
students in the collaborative construction of 
knowledge through an online environment 
that permits building complex knowledge 
structures and labels for many important 
epistemic elements such as hypotheses and 
evidence (Scardamalia, et al., 1989). The 
social character of the enterprise and the 
forms of discourse it externalizes through 
the online environment create conditions for 
Vygotskian internalization of patterns of 
thinking. Studies of impact have shown gains 
in students’ depth of explanation and knowl- 
edge representation, capability in dealing 
with difficult texts, recall of more infor- 
mation from texts, and deeper conceptions 
of the nature of learning with more of a 
mastery emphasis (Scardamalia, Bereiter, & 
Lamon, 1994). 

Of course, developmental psychology has 
evolved greatly since the days of Vygotsky 
and Piaget. For example, the past half 
century has seen development explained 
in terms of expansion in, and more effi- 
cient use of, working memory (e.g., Case, 
1985; Fischer, 1980; Pascual-Leone, 1978); 
semi-independent courses of development 
traced in different domains (e.g., Case, 
1992; Fischer, 1980; Carey, 1985); strands of 
of mind, with innate mental structures 
anticipating certain kinds of knowledge (e.g., 
Detterman, 1992; Hirschfeld & Gelman, 
1994), and so on. 

It is not the role of this chapter to review 
the complexities of contemporary develop- 
mental psychology, especially because as far 
as we know, few approaches to the teaching 
of thinking have based themselves on recent 
developmental theory. Quite likely, there 
are substantial opportunities that have not 
been taken. To give a sense of the promise, 
Case (1992) advanced the idea of central 
conceptual structures, which are core structures 
in broad domains such as quantity, narra- 
tive, and intentionality that lie at the foun- 
dations of development in these domains 
and enable further learning. Working from 
this notion, Griffin, Case, and Capodilupo 
(1995) designed and assessed an interven- 
tion called Rightstart to develop the cen- 
tral conceptual structure for number and 
advance kindergarteners’ preparation for 
learning basic arithmetic operations through 
formal instruction. Testing demonstrated 
that the children in the treatment group 
indeed acquired a more fully developed 
central conceptual structure for number, 
displayed greater understanding of number 
in content areas not included in the train- 
ing, and responded with substantially greater 
gains to later formal instruction in the basics 
of arithmetic as well as showing far transfer 
to sight reading in music and to the notion of 
distributive justice, areas related to the cen- 
tral conceptual structure for number. 

As these examples illustrate, the general 
pedagogical style of the developmental ap- 
proach is to harness “natural” footholds and 
mechanisms of development to accelerate 
and perhaps reach levels that the learner 
otherwise would not attain. As theories of 
action, models of human development, like 
models of intelligence, do not so much offer 
strategic advice to learners as they address 
teachers and especially designers, suggesting 
how they might arrange activities and expe- 
riences that will push development forward. 
Indeed, a common, although questionable, 
tenet of much developmental theory is that 


logical structures. Learners must attain them 
by wrestling with the right kinds of problems 
under appropriately reflective and support- 
ive conditions. 


What Effect Does a Theory of Good 
Thinking Have? 


With approaches to defining good thinking 
through heuristic analysis, intelligence, and 
human development on the table, perhaps 
the most natural question to ask is which 
approach is “right” and therefore would lead 
to the most powerful interventions. Unfor- 
tunately, the matter is far too complex to 
declare a winner. One complication is that 
all programs, despite their theoretical differ- 
ences, share key features. All programs en- 
gage learners in challenging thinking tasks 
that stretch beyond what they normally un- 
dertake. All programs place some empha- 
sis on focused attention and metacognitive 
self-regulation. It may be that these de- 
mand characteristics are the factors that in- 
fluence an intervention’s success more than 
the underlying theory. Furthermore, as un- 
derscored earlier, programs are often eclectic 
in their means: Their methods overlap more 
than their philosophies. 

To further complicate declaring a win- 
ner, different programs speak to the distinc- 
tive needs of different audiences — children 
of marked disabilities with unsystematic 
and impulsive ways of thinking, students of 
elementary science conceptually confused 
about themes such as control of variables, 
math students in college struggling with 
strategies of proof, and so on. 

Another confounding factor is that a tech- 
nically well-grounded theory may not be 
that helpful as a theory of action. As noted 
earlier, this is a problem with classic g theory. 
Finally, and somewhat paradoxically, a theory 
that is in some ways suspect may lead 
to an intervention that proves quite effec- 
tive. For example, Piagetian theory has been 
challenged in a number of compelling ways 
(e.g., Brainerd, 1983; Case, 1984, 1985), yet 
applying certain key aspects of it appears 
to serve the demonstrably effective CASE 
1994), perhaps because the kinds of thinking 
it foregrounds are important to complex 
cognition of the sort targeted, putting aside 
the standing of Piagetian theory as a whole. 

In summary, although approaches based 
on norms and heuristics, theories of intelli- 
gence, and models of development can be 
identified, it is difficult at present to dismiss 
any of them as misguided. As with much of 
human enterprise, the devil is in the details — 
here, the details of particular programs’ 
agendas, the learners they mean to serve, 
and the extent to which their conceptions 
of good thinking provide helpful theories 
of action. 

That said, there is a general limitation 
to all three approaches: They all concern 
what it is to think well when you are think- 
ing. Such criteria are certainly important, 
but this leaves room to ask: What if you 
don’t feel moved to think about the mat- 
ter at hand, or what if you don’t even no- 
tice that the circumstances invite thinking? 
This brings us to the next fundamental 
challenge of teaching thinking — the role 
of dispositions. 


The Challenge of Attending 
to Thinking Dispositions 


We discussed earlier how approaches to 
teaching thinking needed to address the 
question: What is good thinking? In a 
sense, that question was incomplete. Good 
thinkers, after all, are more than people who 
simply think well when they think: They also 
think at the right times with the right com- 
mitments — to truth and evidence, creativity 
and perspective taking, sound decisions, and 
apt solutions. Views of thinking that bring 
this to the fore are often called dispositional 
because they look not just to how well peo- 
ple think when trying hard but what kinds 
of thinking they are disposed to undertake. 
Most views of thinking are abilities- 
centered, but several scholars have devel- 
oped dispositional perspectives — for in- 
stance Dewey (1922), who wrote of habits 


inference framework; Ennis (1986) and 
Norris (1995) as part of analyses of criti- 
cal thinking; Langer (1989, p. 44), with the 
notion of mindfulness, which she defined as 
“an open, creative, and probabilistic state of 
mind”; and Facione et al. (1995). Models of 
self-regulation have emphasized volitional 
aspects of thinking and individuals’ moti- 
vation to engage thoughtfully (Schunk & 
Zimmerman, 1994). We and our colleagues 
have done extensive work in this area, re- 
ferring to intellectual character as a partic- 
ular perspective on dispositions (Ritchhart, 
2002; Tishman, 1994, 1995) and to disposi- 
tions themselves (Perkins, Jay, & Tishman, 
1993; Perkins et al., 2000; Perkins & 
Tishman, 2001; Perkins & Ritchhart, 2004). 

Accordingly, it is important to exam- 
ine the dispositional side of the story and 
appraise its importance in the teaching 


of thinking. 


The Logical Case for Dispositions 


One line of argument for the importance of 
dispositions looks to logic and common ex- 
perience. There is a natural tendency to as- 
sociate thinking with blatant occasions — the 
test item, the crossword puzzle, the choice 
of colleges, the investment decision. Plainly, 
however, many situations call for thinking 
with a softer voice all too easily unheard — 
the politician’s subtle neglect of an alterna- 
tive viewpoint, your own and others’ rea- 
soning from ethnic stereotypes, the comfort 
of “good enough” solutions that are not all 
that good. Even when we sense opportuni- 
ties for deeper thinking in principle, there 
are many reasons why we often shun them — 
blinding confidence in one’s own view, obliv- 
iousness to the possibilities for seeing things 
differently, aversion to complexities and am- 
biguities, and the like. Such lapses seem all 
too common, which is why, for example, 
Dewey (1922) emphasizes the importance 
of good habits of mind that can carry people 
past moments of distraction and reluctance. 
Scheffler (1991, p. 4), writing about cogni- 
tive emotions, put the point eloquently in 
stating that “emotion without cognition is 
vacuous.” 

It also is notable that the everyday lan- 
guage of thinking includes a range of terms 
for positive and negative dispositional traits 
considered to be important: A person may be 
open-minded or closed-minded, curious or 
indifferent, judicious or impulsive, system- 
atic or careless, rational or irrational, gullible 
or skeptical. Such contrasts have more to 
do with how well people actually use their 
minds than how well their minds work. 


The Empirical Case for Dispositions 


The foregoing arguments from logic and 
common sense give some reason to view 
the dispositional side of thinking as im- 
portant. Beyond that, a number of re- 
searchers have investigated a range of dis- 
positional constructs and provided empirical 
evidence of their influence on thinking, their 
trait-like character, and their distinctness 
from abilities. 

Research on dispositional constructs such 
as the need for cognitive closure (Kruglanski, 
1990) and the need for cognition (describ- 
ing an individual’s tendency to seek, engage 
in, and enjoy cognitively effortful activity; 
Cacioppo & Petty, 1982) has shown that they 
influence when and to what extent individu- 
als engage in thinking and has demonstrated 
test-retest reliability (Kruglanski, 1990; 
Cacioppo et al., 1996). Measures of an 
individual’s need for cognition developed 
by Cacioppo and colleagues show that it 
is a construct distinguishable from ability 
(Cacioppo et al., 1996). 

Dweck and colleagues investigated an- 
other dispositional construct for a number 
of years — the contrast between entity learn- 
ers and incremental learners (Dweck, 1975, 
2000). Broadly speaking, learners with an 
entity mindset believe that “you either get 
it or you don’t,” and if you don’t, you prob- 
ably are not smart enough. As a result, they 
tend to quit in the face of intellectual chal- 
lenges. In contrast, learners with an incre- 
mental mindset believe their abilities can 
be extended through step-by-step effort, so 


search has shown that these traits are inde- 
pendent of cognitive abilities but often affect 
cognitive performance greatly. Also, teach- 
ing style and classroom culture can shape the 
extent to which students adopt entity versus 
incremental mindsets. 

Using self-report measures of dogmatism, 
categorical thinking, openness, counterfac- 
tual thinking, superstitious thinking, and ac- 
tively open-minded thinking, Stanovich and 
West (1997) found these measures predicted 
performance on tests of argument evalu- 
ation even after controlling for cognitive 
capacities. 

These studies support the notion that dis- 
positional constructs do influence behavior 
and can be useful in predicting performance, 
although perhaps not in any absolute sense. 
One can be curious in one situation and not 
in another, for instance. Likewise with dispo- 
sitions such as friendliness or skepticism. Al- 
though there is evidence for cross-situational 
stability for some dispositional constructs 
(Webster & Kruglanski, 1994), the value of 
the dispositional perspective does not rest 
on an assumed cross-situational character. 
Indeed, rather than acting in a top-down, 
trait-like fashion, dispositions offer a more 
bottom-up explanation of patterns of behav- 
ior consistent with emerging social-cognitive 
theories of personality (Cervone, 1999; 
Cervone & Shoda, 1999). A dispositional 
perspective takes into account both the situ- 
ational context and individual motivational 
factors, positing that patterns of behavior are 
emergent and not merely automatic. To bet- 
ter understand how such behavior emerges 
and how dispositions differ from traits, it is 
necessary to break apart dispositional behav- 
ior into its distinct components. 

For a number of years, the authors and 
their colleagues have sustained a line of re- 
search on the nature of dispositions, as cited 
earlier. Although most scholars view dispo- 
sitions as motivating thinking, we have ana- 
lyzed the dispositional side of thinking into 
two components — sensitivity and inclina- 
tion. Sensitivity does not motivate think- 
ing as such but concerns whether a person 
events that might call for thinking, such as 
noticing a hasty causal inference, a sweeping 
generalization, a limiting assumption to be 
challenged, or a provocative problem to be 
solved. Inclination concerns whether a per- 
son is inclined to invest effort in thinking 
the matter through because of curiosity, per- 
sonal relevance, and so on. 

Our empirical research argues that sensi- 
tivity is supremely important. We used sto- 
ries that portrayed people thinking through 
various problems and decisions with embed- 
ded shortfalls in their thinking, such as not 
going beyond the obvious options or not ex- 
amining the other side of the case (Perkins 
et al., 2000; Perkins & Tishman, 2001). In 
multiple studies, we found that subjects 
detected only about 10% of the thinking 
problems, although, when prompted, they 
showed good ability, readily brainstorming 
further options or generating arguments on 
the other side of the case. Inclinations played 
an intermediate role in their engagement 
in thinking. 

In one study, we examined test-retest 
correlations on sensitivity scores for detecting 
thinking shortfalls and found correlations of 
about 0.8 for a ninth grade sample and 0.6 
for a fifth grade sample. The findings provide 
evidence that sensitivity to the sorts of short- 
falls examined is a somewhat stable charac- 
teristic of the person. In several studies, we 
examined correlations between our disposi- 
tional measures and various measures of cog- 
nitive ability with results ranging from no to 
moderate correlation but lower than correla- 
tions within ability measures (Perkins et al., 
2000; Perkins & Tishman, 2001). The find- 
ings suggest that sensitivity and inclination 
are not simply reflections of cognitive ability 
as usually conceived: Dispositions are truly 
another side of the story of thinking. 
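
To make the notion of a test-retest correlation concrete, the sketch below computes a Pearson correlation between two administrations of a sensitivity measure. The scores are invented for illustration and are not data from the studies cited; the function name `pearson_r` and the variable names are likewise hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Root sums of squared deviations for each list.
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return cross / (dev_x * dev_y)


# Hypothetical counts of thinking shortfalls detected by eight students
# on a first and a second administration of the same kind of task.
first_session = [3, 5, 2, 8, 6, 4, 7, 5]
second_session = [4, 5, 3, 7, 6, 3, 8, 4]

r = pearson_r(first_session, second_session)
print(round(r, 2))
```

A value near 1 would indicate that students who detected many shortfalls on the first occasion also did so on the second, which is the pattern read here as evidence of a somewhat stable characteristic.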


Cultivating Thinking Dispositions 


These lines of evidence support the funda- 
mental importance of dispositions in under- 
standing what it is to be a good thinker. The 
question remains what role attention to dis- 


teaching of thinking. Most programs do not 
attend directly and systematically to dispo- 
sitional aspects of thinking, although they 
may foster dispositions as a side-effect. In- 
deed, it is inconvenient to address disposi- 
tions through programs that focus on direct 
instruction and regimens of practice. The 
dispositional side of thinking concerns notic- 
ing when to engage thinking seriously, which 
inherently does not come up in abilities- 
centered instruction that point-blank directs 
students to think about this or that problem 
using this or that strategy. 

One response to this difficulty is that culture 
is the best teacher of dispositions (cf. 
Dewey, 1922, 1933; Ritchhart, 2002; Tish- 
man, Jay, & Perkins, 1993; Tishman, Perkins, 
& Jay, 1995; Vygotsky, 1978). A culture in 
the classroom, the family, or the workplace 
that foregrounds values of thinking and en- 
courages attention to thinking would plausi- 
bly instill the attitudes and patterns of alert- 
ness called for. 

Interventions that wrap learners in a 
culture include the Philosophy for Chil- 
dren program developed by Lipman and 
colleagues (Lipman, 1988; Lipman, Sharp, 
& Oscanyan, 1980), which foregrounds 
Socratic discussion, and the online col- 
laborative knowledge-building environment 
CSILE (Scardamalia & Bereiter, 1996; 
Scardamalia et al., 1989, 1994), both of 
which were discussed earlier. Instrumental 
Enrichment (Feuerstein, 1980) involves a 
strong culture of support between media- 
tor and learners. We have also worked on 
programs with a cultural emphasis, includ- 
ing Keys to Thinking (Perkins, Tishman, & 
Goodrich, 1994; Cilliers et al., 1994) and one 
now under development (Perkins & Ritch- 
hart, 2004), and have published a book for 
teachers with this emphasis — The Thinking 
Classroom (Tishman, Perkins, & Jay, 1995). 
The theme of cultures of thinking is impor- 
tant in other ways as well, so, rather than 
elaborating further, we will return to it in a 
later section. 

It is reasonable to ask whether such in- 
terventions have been shown to enhance 
nately, evidence on this question is sparse. 
Although most of these programs have 
been formally evaluated, the assessments 
by and large are abilities-oriented. Their 
performance-on-demand character does not 
estimate what students are disposed to do 
in the absence of explicit demands, which 
is what dispositions are all about. That 
acknowledged, it is worth recalling that 
CSILE students revealed deeper concep- 
tions of the nature of learning, a tendency 
to make mastery-oriented choices in their 
learning, and an avowed valuing of deep 
thinking (Scardamalia, Bereiter, & Lamon, 
1994). Low-ability students responding to 
IE show marked increases in self-confidence 
(Feuerstein et al., 1981; Rand, Tannenbaum, 
& Feuerstein, 1979). The authors think it 
likely that many programs have at least 
some impact on learners’ dispositions, but 
an extensive empirical case remains to 
be made. 

In summary, both folk psychology and 
a good deal of academic psychology give 
abilities center stage in explaining good and 
not-so-good thinking and thinkers. Along 
with this abilities-centered view of think- 
ing comes a concomitant view of what it 
is to teach thinking: To get people to think 
better and improve their abilities, teach 
problem-solving skills, learning skills, self- 
management skills, and so on. All this cer- 
tainly has value as far as it goes. However, the 
arguments advanced here question the com- 
pleteness of the storyline. They challenge 
whether performance-on-demand tasks are 
a good model of how thinking works in ev- 
eryday life and urge that well-rounded ef- 
forts to teach thinking attend to dispositional 
development as well as the development 
of abilities. 

As is the case with abilities development, 
dispositions need to be considered from the 
standpoint of transfer of learning. Not only 
skills, but dispositions need to be generalized 
broadly from their initial contexts of learn- 
ing for them to develop a robust nature. This 
brings us to our next challenge, that of teach- 
ing transfer. 


Like education in general, efforts to teach 
thinking do not simply target the here and 
now: They mean to serve the there and 
then. What learners acquire today in the 
way of thinking skills, strategies, cognitive 
schemata, underlying cognitive operations, 
dispositions, metacognitive capabilities, and 
the like aims to help them there and then 
make a difficult personal decision or study 
quantum physics or manage a business or 
draft and deliver a compelling political state- 
ment. In other words, the teaching of think- 
ing reaches for transfer of learning. Some- 
times the ambition for transfer is modest — 
experiences with reading for understanding 
or mathematical problem solving here and 
now should improve performance for the 
same activities later in other contexts. Not 
uncommonly, however, the ambition is far 
more grand — fundamental and far-reaching 
transformation of the person as a thinker. 

Some have charged that such ambitions 
are overwrought. Although thinking may be 
cultivated in particular contexts for partic- 
ular purposes, far-reaching transformation 
may be impossible. Relatedly, some have 
argued that it may be impossible to teach 
thinking in an abstract way — say, with 
puzzle-like problems and through stepwise 
strategies — with gains that will spread far 
and wide. 

Empirical research shows us that the 
prospects of transfer cannot be utterly bleak. 
In the second section of this article, we of- 
fered a number of existence proofs for mag- 
nitude, persistence, and transfer of impact, 
and more appeared in the subsequent sec- 
tion. Before looking further at such results, 
let us hear the case for meager transfer. At 
least three lines of scholarship pose a chal- 
lenge to transfer — research on transfer it- 
self, research on expertise and the role of 
knowledge in cognition, and research on sit- 
uated cognition. We will look briefly at each 
in turn. 

Transfer of learning has a vexed his- 
tory, particularly with respect to far transfer, 
a somewhat informal term for transfer 
to contexts quite different from that of initial learning (see Holyoak, Chap. 6, for
a review of work on transfer by use of 
analogies). We can touch only briefly on 
this complex literature. The classic studies 
are Thorndike’s (1923, Thorndike & Wood- 
worth, 1901) demonstrations that the intel- 
lectual rigor of studying Latin did not lead 
to improved performance on other fronts. 
Since that time, numerous reviews and com- 
pilations have shown that far transfer is hard 
to come by (e.g., Detterman, 1992; Detter- 
man & Sternberg, 1992; Salomon & Perkins, 
1989). For an interesting echo of Thorndike’s 
era, a number of efforts in the 1980s to teach
various versions of computer programming
as, it might be said, “the new Latin,” generally
showed no cognitive gains beyond the
programming skills themselves (Salomon &
Perkins, 1987). Thorndike’s view that transfer
depends on “identical elements” and is
less likely to apply to domains far removed
from one another remains a tempting explanation
of the difficulties.

A more recent view in a somewhat sim- 
ilar spirit, Transfer Appropriate Processing, 
holds that the prospects of transfer de- 
pend on a match between the features fore- 
grounded during initial encoding and the 
kinds of features called for in the target con- 
text. Initial encoding may tie the learning 
to extraneous or unnecessarily narrow fea- 
tures of the situation, limiting the prospects 
of transfer to other situations that happen to 
share the same profile (Morris, Bransford, & 
Franks, 1977). Another rather different bar- 
rier reflects the position held by many IQ 
theorists that there is nothing to train and 
transfer: Very general cognitive capabilities 
simply are not subject to improvement by 
direct training, although genetics, nutrition, 
long-term enculturation by schooling, and 
other factors may influence general cogni- 
tive capability. 

Research directly on transfer aside, more 
damage to the prospects comes from studies 
of expertise and the importance of domain- 
specific knowledge. Although it might be 
thought that skilled cognition reflects gen- 
eral cognitive capabilities, an extensive body 
of research points to the importance of familiarity with the knowledge,
strategies, values, challenges, and other fea- 
tures of particular disciplines and endeavors 
(e.g., Bereiter & Scardamalia, 1993; Ericsson
& Smith, 1991; Ericsson, 1996). For a
classic example, de Groot (1965) and, build- 
ing on his work, Chase and Simon (1973) 
demonstrated that skillful chess play de- 
pends on a large repertoire of strategic pat- 
terns about chess specifically accessed in a 
perception-like way (see Novick & Bassok, 
Chap. 14). 

Evidence from a range of professions ar- 
gues that naturalistic decision-making de- 
pends on quick typing of situations to link 
them to prototypical solutions that can be 
adjusted to the immediate circumstances 
(Klein, 1999). In the same spirit, path analyses
of performance in practical job contexts
have shown specific knowledge to be a
much more direct predictor of performance 
than general intelligence (Hunter, 1986). 
Several scholars have argued that intelligent 
behavior is deeply context bound (e.g., Ceci,
1990; Detterman, 1992b; Glaser, 1984; Lave, 
1988). Effective thinking depends so much 
on a large repertoire of reflexively activated, 
context-specific schemata that substantial 
transfer of expert thinking from one domain 
to another is impossible. Everyday support 
for this comes from the informal observation 
that people rarely manage to display high- 
level thinking in more than one field. 

Interventions consistent with this view in- 
clude programs in mathematics and science 
education that focus on a particular domain 
and try to advance learners’ expertise. For 
example, Schoenfeld and Herrmann (1982) 
documented how subjects in a previously 
mentioned experimental intervention based 
on heuristics became more expert-like in 
their mathematical problem solving, coding 
problems more in terms of their deep struc- 
ture than surface features. 

Further skepticism about the prospects 
for far transfer derives from studies of the 
situated character of cognition and learning 
(Brown, Collins, & Duguid, 1989; Kirshner &
Whitson, 1997; Lave, 1988; Lave & Wenger, 
activity is socially and physically situated in
particular contexts, depending for its flu- 
ency and depth on a web of interactions 
with peers, mentors, physical and symbolic 
tools, and so on. Skill and knowledge do 
not so much sit in the heads of individu- 
als as they are distributed through the social 
and physical setting (Salomon, 1993) and 
constituted through that setting. Individuals 
off-load certain thinking tasks onto the en- 
vironment by use of note-taking, organiza- 
tional mechanisms, fellow collaborators, and 
other technological tools to free up mental 
space for more complex forms of thinking 
(Pea, 1993). 

Accordingly, complex cognition is more 
likely to develop through “cognitive appren- 
ticeship” (Collins, Brown, & Newman, 1989) 
in the context of rich social and physical 
support than through instruction that at- 
tempts to teach abstract schemas. Within 
such environments, individuals may first par- 
ticipate on the periphery of the group or 
with high levels of support and gradually
progress to more independent and central
forms of operation as their expertise and
comfort level increase (Lave & Wenger,
1991). Because cognition is so situated, the 
story goes, it is hard to uproot patterns of 
cognition and transplant them into very dif- 
ferent contexts where they can still thrive. 
Interventions consistent with this view in- 
clude, for example, the CSILE collabora- 
tive online knowledge building environment 
mentioned earlier (Scardamalia, Bereiter, & 
Lamon, 1994) and the Jasper Woodbury pro- 
gram, which helps youngsters build math- 
ematical skills and insights through situ- 
ating problem solving within compelling 
narratives and by making it a social endeavor 
(Van Haneghan et al., 1992). 

This triple challenge to the prospects of 
transfer seems daunting indeed. However, 
it is important to emphasize that these cri- 
tiques by and large address the prospects 
of far transfer. They allow ample room for 
CSILE, the Jasper Woodbury program, writ- 
ers’ workshops, design studios, philosophy 
classes and the like, where the aim is to get 
better at a particular kind of thinking. 


tise, and situated cognition just outlined 
have their critics as well as their propo- 
nents. Many moderate positions take the 
most severe implications of these views 
with a large grain of salt. For example, Sa- 
lomon and Perkins (1989) outlined a two- 
channel model of transfer specifying con- 
ditions for transfer by way of reflective 
abstraction and by way of automatization 
of routines, pointing out that there certainly 
were some successes reported in the trans- 
fer literature, and explaining a range of fail- 
ures by the absence of conditions that would 
support transfer along one channel or the 
other. In a similar spirit, Gick and Holyoak
(1980, 1983) (see Holyoak, Chap. 6) demon- 
strated effective transfer between quite dif- 
ferent problem-solving contexts when sub- 
jects spontaneously or upon prompting 
reflectively abstracted underlying principles. 
Bassok and Holyoak (1993) summarized experiments
making the case that superficial
content context was not as limiting
as some had argued. In many cases, learn- 
ers bridged quite effectively from one con- 
tent context to another quite different, al- 
though mismatches in the character of key 
variables in source and target sometimes in- 
duced considerable interference. Bransford 
and Schwartz (1999) urged reframing the 
problem of transfer in terms of readier learn- 
ing in the future, not of direct gains in per- 
formance, arguing that this afforded ample 
opportunity for far transfer. 

Turning to the theme of expertise, it can 
be acknowledged that a rich collection of 
schemata constitutes an essential engine for 
high-level thinking in a domain. Although
necessary, this engine is not in itself sufficient.
Expert status does not protect a person
from blind spots such as failure to examine 
the other side of the case (Perkins, Farady, & 
Bushey, 1991). Indeed, people who “ought 
to know better” can behave with remark- 
able obtuseness (Sternberg, 2002). In keep- 
ing with this, many norms and heuristics 
for good thinking address not the complex 
knowledge characteristic of domain mas- 
tery but broad patterns of processing, such 
as engaging anomalies seriously, examining 
tions, the neglect of which commonly entraps
even those with well-developed knowledge
in a domain (see Chi and Ohlsson,
Chap. 16). 

Moreover, expert thinking is misleading 
as a gold standard. Producing expert think- 
ing by no means is the sole aim of the 
teaching of thinking. In many contexts, good 
thinking needs to be understood not as 
good-for-an-expert but good-for-a-learner 
or good-for-an-amateur. Some scholars have 
observed that there seems to be such a thing 
as “expert novices” and “expert learners”
who bring to learning situations a range of at- 
titudes and strategies highly conducive to de- 
veloping expertise more quickly (Bereiter & 
Scardamalia, 1993; Brown, Ferrara, & Campione,
1983; Bruer, 1993). Moreover, in
many facets of complex modern life — con- 
sider filing income taxes, functioning as re- 
sponsible citizens, purchasing a new car or 
home — most of us operate as perpetual am- 
ateurs. We do not engage in such activities 
enough to build deep expertise. The ques- 
tion is less whether good general thinking 
enables us to behave like an expert — it does 
not — and more whether good general think- 
ing enables us to perform better than we oth- 
erwise would by leveraging more effectively 
what knowledge we do have and helping us 
to acquire more as we go. 

Turning to the related theme of situated 
knowledge, Anderson, Reder, and Simon 
(1996) identified four core claims character- 
istic of the situated position — that action 
is grounded in concrete situations, knowl- 
edge does not transfer between tasks, train- 
ing by abstraction is of little use, and in- 
struction must be done in complex social 
environments — and proceeded to summa- 
rize empirical evidence contrary to all of 
them as universal generalizations. Bereiter 
(1997) and Salomon and Perkins (1998) un- 
derscored how learners productively learn 
under many degrees and kinds of social re- 
lations and situatedness. Greeno, Smith, and 
Moore (1992) offered an account of transfer
from the perspective of situated cognition,
explaining how people sometimes export
systems of activity to other superficially


this is certainly not to argue the opposite — 
that transfer comes easily, expertise depends 
largely on general cognitive capabilities, and 
learning is not somewhat entangled in its par- 
ticular contexts — but rather to point out that 
the most dire readings of the prospects of 
transfer do not seem to be warranted. 

Although the foregoing treats the general 
debate, the evidence on transfer from efforts 
to teach thinking also warrants considera- 
tion. As cited earlier, Nisbett (1993) sum- 
marized a number of studies in which efforts 
to teach statistical, if-then, cost-benefit, and 
other sorts of reasoning had led to transfer 
across content domains. As emphasized un- 
der the first challenge we addressed, there 
is considerable evidence for persistent far 
transfer of improvements in thinking from 
a number of studies. The signs of such trans- 
fer include impact on general reading skills, 
IQ-like measures, thinking in various subject
matters, the general cognitive competence
of retarded people, and more. It will
be recalled that the philosophies and meth- 
ods of these programs are quite diverse, with 
some using rather abstract tasks well re- 
moved from any particular subject matter or 
natural community. 

In summary, we suggest that the de- 
bate around transfer, expertise, and situated 
learning has been overly polarized and ideo- 
logical, leading to sweeping declarations on 
both sides regarding what is possible or im- 
possible that do not stand up to empiri- 
cal examination. The relationship between 
general cognitive structures and particular 
situations perhaps needs to be understood 
as more complex and dynamic. Perkins and 
Salomon (1989) offer the analogy of the 
human hand gripping something. The hu- 
man hand plainly is a very flexible gen- 
eral instrument, but it always functions in 
context, gripping different things in differ- 
ent ways. Moreover, we need to learn to 
grasp objects according to their affordances: 
You don’t hold a baby the same way you 
hold a brick. Likewise, one can acknowl- 
edge a broad range of general strategies, 
cognitive operations, and schemata with- 
out naively holding that they operate in 
ways made, sometimes easily, sometimes
with difficulty. Skilled cognition involves 
complex interarticulations of the general and 
the specific. 

So the prospects of transfer escape 
these skirmishes with skepticism — but not 
unscathed! Indeed, there are pointed lessons 
to be drawn. We can learn from research 
on the difficulties of transfer that transfer is 
nothing to take for granted. Well-designed 
efforts to cultivate thinking will face up 
to the challenge, for instance by incorpo- 
rating episodes of reflective abstraction to 
help learners to decontextualize patterns of 
thinking and by providing practice across 
multiple distinct contexts. Well-designed ef- 
forts to cultivate thinking will look closely 
at the behavior of experts to construct their 
heuristic analyses, and will not expect gen- 
eral norms and heuristics to do the job of 
norms and heuristics tailored to particular 
endeavors such as writing or mathemati- 
cal problem solving. Well-designed efforts 
to cultivate thinking will recognize the dis- 
tributed nature of cognition, and take advan- 
tage of social and physical support systems to 
advance individual and collective thinking. 


The Challenge of Creating Cultures 
of Thinking 


Thus far, we’ve examined four challenges 
that efforts to teach thinking traditionally 
have faced. As teachers and program devel- 
opers seek to meet those challenges, a host 
of additional concerns arise. For example:
How do we provide enough time, context, 
and diverse applications so that new pat- 
terns of thinking actually take hold? How 
can we best take into account that school 
learning happens in a social context within a 
classroom among a group of individuals? Is 
the development of individual thinking best 
served and supported by the development 
of group learning practices? How do we un- 
cover the thinking that is going on in individ- 
uals and within the group so we can respond 
to it and learn from it? These questions con- 


challenge of creating cultures of thinking. 

Culture has been mentioned briefly 
in previous sections, but one still might 
ask: What is it about culture, and cul- 
tures of thinking in particular, that de- 
mands attention (see Greenfield, Chap. 27, 
for further discussions on the role of 
culture)? Three important motives are wor- 
thy of attention: First, the supporting struc- 
tures of culture are needed to sustain gains 
and actualize intelligent behavior over time, 
as opposed to merely building short-term 
capacity (Brown & Campione, 1994; Scar- 
damalia et al., 1994; Tishman, Perkins, & 
Jay, 1993). It is through the culture of the 
classroom that strategies and practices take 
on meaning and become connected to the 
work of learning. Second, culture helps to 
shape what we attend to, care about, and 
focus our energies upon (Bruner, Olver, 
& Greenfield, 1966; Dasen, 1977; Super, 
1980). Thus, culture is integrally linked to 
the dispositional side of thinking and to 
the cultivation of inclination and sensitivity. 
Third, researchers and program developers 
increasingly have recognized that thinking 
programs are not merely implemented but 
are enacted, developed, and sustained in a 
social context. As a result, they have found it 
necessary to move away from teacher-proof 
materials, which view learning as an isolated 
individual process, and toward approaches 
that pay more attention to the underlying 
conditions of learning. 

As a result of the awareness of the 
role culture plays in learning, the past two 
decades have seen efforts to teach think- 
ing shift from programmed strategy in- 
struction aimed at students as individu- 
als to broad-based approaches aimed at 
building classroom cultures supportive of 
the active social construction of knowl- 
edge among groups. These approaches take 
a variety of forms, such as cognitive ap- 
prenticeship (Collins, Brown, & Newman, 
1989), fostering a community of learners 
(Brown & Campione, 1994), group knowl- 
edge building (Bereiter & Scardamalia, 1996; 
Scardamalia & Bereiter, 1996), inquiry- 
based teaching (Lipman, 1983), and the 
man, Perkins, & Jay, 1995) and habits of mind
(Costa & Kallick, 2002). Several programs 
associated with these approaches were mentioned
previously, CSILE/Knowledge Forum,
Philosophy for Children, and Keys to
Thinking among them. We’ll examine a few 
additional ones subsequently. Before doing 
so, however, it may be useful to take a closer 
look at just what is meant by culture in the 
cultural approach. 

Culture, construed broadly, refers to the 
context and general surround in which we 
operate. This doesn’t tell us much about 
what it means to become enculturated, how- 
ever. To illuminate this issue it is helpful to 
look at particular intellectual subcultures or 
communities of practice, say of mathemati- 
cians or writers or even mechanics. What 
does it mean to be a part of these cultures? 
A frame that we have found useful is based 
on two top-level conceptions: resources and 
practice (Roth, 1995). Resources are the 
things upon which members of the culture 
of practice draw when they do their work. 
Resources can be physical in nature: com- 
puters, books, instruments, tools, and the 
like. There are also social resources such as 
colleagues, coworkers, editors, peer-review 
boards, and so on. These types of resources 
help distribute cognition outside the individ- 
ual thinker’s mind (Salomon, 1993). In ad- 
dition, there are conceptual resources con- 
sisting of the conceptual, knowledge, and 
belief systems in which the subculture read- 
ily traffics. Also included in the conceptual 
resources are the symbol systems and nota- 
tional structures evolved to support abstract 
thought (Gardner, 1983; Goodman, 1976; 
Olson, 1974). 

Practice captures the constructive acts en- 
gaged in by the cultural group — what it is 
they do, the kind of work that is valued and 
rewarded, the methods they employ. This 
connects the group to the socio-historically 
valued ways of knowing and thinking, such 
as the epistemic forms of the disciplines that 
are part of the group’s heritage (Collins & 
Ferguson, 1993; Perkins, 1994, 1997). Re- 
sources and practice interact dialectically in 
that individual and group practice trans- 


fect on practice. At the same time, resources 
and practice provide supports for distributed 
intelligence, scaffolding intelligent behavior 
beyond that which can be displayed by an 
individual mind (Salomon, 1993). 

This dialectical interplay between prac- 
tice and resources informs our understand- 
ing of just what the “it” is in which individu- 
als become enculturated. But, how does this 
enculturation happen? How are a culture’s 
practice and resources conveyed and learned 
by group members? In a study of thoughtful
classrooms, Ritchhart (2002)
identified seven cultural forces at work in 
classrooms that facilitated the process of en- 
culturation in thinking: (1) messages from 
the physical environment about thinking, 
(2) teacher modeling of thinking and dis- 
positions, (3) the use of language of think- 
ing, (4) routines and structures for thinking, 
(5) opportunities created for thinking, (6) 
conveyance of expectations for thinking, and 
(7) interactions and relationships supportive 
of thinking. 

These cultural forces act as direct and in- 
direct vehicles for teaching. For example, the 
use of routines and structures for thinking, 
which connects to the idea of norms and 
heuristics mentioned previously, is a highly 
integrated but still direct form of teaching. 
By introducing “thinking routines” (Ritch- 
hart, 2002), teachers provide students with 
highly transportable tools for thinking that 
they learn in one context and then transfer 
to other situations over time until the strat- 
egy has become a routine of the classroom. 
We and our colleagues are currently capital- 
izing on this approach in the design of a new 
thinking program. The use of the language of 
thinking (Tishman & Perkins, 1997; Ritchhart,
2002) — which includes process (justify- 
ing, questioning, analyzing), product (theory, 
conjecture, summation), stance (challenge, 
agree, concur), and state (confused, puzzled, 
intrigued) words — is a much more indirect 
method of promoting thinking that gives stu- 
dents the vocabulary for talking about think- 
ing. By combining the direct (routines and 
structures, and opportunities) and the indi- 
rect (modeling, language, relationships and 
culture of thinking is built and sustained.

One can see these cultural forces at play
in the Community of Learners approach 
(Brown & Campione, 1994). In this ap- 
proach, a premium is placed on research, 
knowledge-building, and critical thinking, 
thus communicating expectations for thinking
to students through the types of opportunities
provided. In this environment, individual
responsibility is coupled with the
communal sharing of expertise. Discourse 
(constructive discussion, questioning, and 
criticism) is the norm, making use of the 
language of thinking and interactions and 
relationships supportive of thinking. Ritual, 
familiar participant structures, and routines 
are introduced to help students navigate and 
work within the new culture. All of this is 
accomplished within an environment that 
makes thinking visible for students. 

Research suggests that, at least in this
particular case, a broad-based cultural approach
was superior to one based on teaching
heuristics. On criterion-referenced tests
of reading comprehension, approximately
ninety fifth and sixth graders in the Community
of Learners (CL) group outperformed a group
using only a reciprocal teaching technique,
in which students led the learning in reading
discussions (and this result occurred even
though the reciprocal teaching group was given
twice as much practice as the CL group).
There was no improvement in a reading-only 
control group. Scores on questions dealing 
with inference, gist, and analogy improved 
dramatically. The results show magnitude of 
effects but require further study to assess the 
generality and persistence of effects. Further 
research is needed to determine whether the 
effects are sustaining in the sense of ongoing 
repertoire, the ultimate goal of a cultural ap- 
proach, or whether their impact is limited to 
behaviors in the immediate environment. 

A common thread running through
cultural approaches to teaching thinking is 
the effort to make thinking visible, often 
through the various cultural forces. This 
occurs as teachers model their thought pro- 
cesses before the class, students are asked to 
share their thinking and discuss the processes 


coming to conclusions, group ideas and 
conjectures are recorded and reviewed, the 
artifacts of thinking are put on display in the 
classroom, and so on. At the heart of these 
efforts lies reflection on one’s thinking and 
cognitive monitoring, the core processes 
of metacognition. Ultimately, teaching 
students to be more metacognitive and 
reflective, providing rich opportunities for 
thinking across various contexts, setting up 
an environment that values thinking, and 
making the thinking of group members visi- 
ble contribute a great deal to the formation 
of a culture of thinking. The cultural forces 
can be leveraged toward this end. Within 
such a culture of thinking, other efforts to 
teach thinking, both formal and informal, 
have a greater likelihood of taking hold 
because they will be reinforced through the 
culture and opportunities for transfer and 
reflection will increase. 

In summary, in some sense, a fully de- 
veloped culture of thinking in the class- 
room or, indeed, in other settings such as the 
home or the workplace, represents the cohe- 
sive culmination of the separate challenges 
of achieving results, defining the thinking, 
attaining transfer, and attending to think- 
ing dispositions. A thoroughgoing culture of 
thinking attends to all of these. Unfortu- 
nately, the converse is certainly not so. It 
is possible to attend assiduously to the first 
four — say, every Tuesday and Thursday from 
11 to 12, or when we do math projects for a 
day at the end of each unit — and still fall 
far short of a pervasive culture of thinking. 
Results reviewed earlier in this article sug- 
gest that even limited treatments may well 
benefit students’ thinking. However, one has 
to ask about the rest of their learning. In the 
end, the point of a culture of thinking is not 
just to serve the development of thinking but 
to serve the breadth and depth of students’ 
learning on all fronts. 


Conclusions and Future Directions 


This review of the teaching of thinking has 
cast a wide net to look at programs for which 
discussion. These programs address a great
variety of thinking — creative and critical 
thinking, problem solving, decision making, 
and metacognition as well as subject-specific 
types of thinking. Even so, we have only 
scratched the surface of the ongoing efforts 
to teach thinking. Why does the teaching 
of thinking continue to be such a central 
question in education? Why do we even 
need to teach thinking? As discussed ear- 
lier, efforts to teach thinking deal with both 
amplifying native tendencies and addressing 
problems of thinking shortfalls. In addition, 
a major goal of most thinking interventions 
is to enhance learning and promote deeper 
understanding. The idea that deep and last- 
ing learning is a product of thinking provides 
a powerful case for the teaching of thinking. 
Indeed, we venture that the true promise of 
the teaching of thinking will not be realized 
until learning to think and thinking to learn 
merge seamlessly. 

Toward this end, we singled out five 
challenges that must be dealt with along 
the way. The first addressed the question 
of whether or not thinking can be taught 
with some reasonable signs of success. We 
reviewed several programs as a kind of 
existence proof that, indeed, it is possi- 
ble to produce impacts with substantial 
magnitude, persistence, and transfer. These 
programs spanned a variety of philosoph- 
ical and methodological approaches while 
sharing the common characteristics of in- 
creasing the demand for thinking, devel- 
oping thinking processes, and paying at- 
tention to metacognitive self-regulation. 
These common demand characteristics ap- 
pear to be key elements in the teaching 
of thinking. 

The second challenge concerned what 
one means when talking about good think- 
ing. We showed how efforts to teach think- 
ing are shaped largely by how they answer 
this question. Thus, the content, sequence, 
and methods of instruction for a particular 
intervention arise from a single or collec- 
tive set of grounding theories, be they linked 
to norms and heuristics, intelligence, or hu- 
man development. Interestingly, programs 


achieved substantial success. Why should 
this be? Does theory matter at all? As with 
the first challenge, the answer to effective- 
ness may lie more with certain demand char- 
acteristics of programs than with any single 
theoretical approach. Increased explicit in- 
volvement with thinking and systematic at- 
tention to managing one’s thinking may be 
the most critical conditions. To untangle this 
issue empirically, one would need to com- 
pare the effectiveness of programs with dif- 
ferent theoretical bases but with the same 
demands for thinking and reflection. Unfor- 
tunately, it is rare in the literature on the 
teaching of thinking to find alternative ap- 
proaches addressing the same kinds of think- 
ing and the same sorts of learners pitted 
against one another. 

The third challenge dealt with the dispo- 
sitional side of thinking. We showed how the 
effective teaching of thinking is more than 
just the development of ability, demand- 
ing the development of awareness and in- 
clination as well. In particular, the lack of 
a sensitivity to occasions for thinking appears
to be a major bottleneck when it comes
to putting one’s abilities into action. It is 
our belief that some programs accomplish 
this. Although most data focus on abilities, 
leaving impact on sensitivity and inclination 
unassessed, there are a few indications of im- 
pact on dispositions. Certainly, more work is 
needed in this area. 

Transfer, a pivotal concern within the 
teaching of thinking, constituted our fourth 
challenge. Although some have argued that 
transfer cannot be obtained because all 
knowledge is bound to context, the empiri- 
cal record of successful programs has shown 
clearly that some degree of transfer is pos- 
sible across domains of content knowledge. 
This is by no means automatic, however. 
Transfer must be designed deliberately into 
interventions by highlighting key features of 
the situation that need attention, promot- 
ing reflective abstraction of underlying prin- 
ciples, and providing practice across multiple 
contexts. Even then, one is more likely to see 
near transfer of thinking to similar contexts 
than far transfer. 

The fifth challenge, creating cultures of
thinking, examined the social context and
environment in which thinking is
fostered. Efforts to teach thinking cannot be 
removed from their social context. Context 
provides important avenues for the devel- 
opment of supporting inclinations toward 
thinking, learning from more accomplished 
peers, focusing attention, and access to the 
resources and practices of the group. In class- 
rooms, a set of cultural forces directs and 
shapes students’ learning experiences both 
directly and indirectly. These cultural forces 
convey to students how much and what 
kinds of thinking are valued, what meth- 
ods the group uses to go about thinking, 
and what expectations there are regarding 
thinking. Furthermore, the thinking of indi- 
viduals and groups is made visible through 
these forces. 

Our review of these five challenges sug- 
gests several fronts for further investigation: 


• The questions of transfer and sustained
impact need to be better understood. 
In particular, little is known about the 
impact of extended interventions. One 
might expect that broad multi-year in- 
terventions would yield wide impact sus- 
tained for many years, but the empirical 
work has not been done to our knowl- 
edge. Relatedly, what would be the ef- 
fect of a cross-subject thinking interven- 
tion in which students encounter the 
same practices concurrently in multiple 
disciplines? 

• An exploration of the trade-offs among
the norms and heuristics, models of intel- 
ligence, and developmental approaches is 
needed to better understand the role of 
theory in successful interventions. How 
and where does the underlying theory 
of thinking matter? When demands for 
thinking are held constant, does one 
theoretical approach work better than 
another? What is it that makes success- 
ful programs work? What characteris- 
tics and practices are most pivotal to 
success? 


• Within the realm of thinking dispositions,
there is much to be learned. How success- 


the dispositional side of thinking? What 
kinds of practices and interventions ef- 
fectively foster students’ inclination and 
sensitivity? Are dispositions bound to the 
social context in which they are devel- 
oped or do they transfer to new settings? 
How does attention to the development 
of sensitivity to occasions affect transfer of 
thinking skills? Efforts to teaching think- 
ing skills are sometimes done in a limited 
time frame, raising the question: What is 
the appropriate time frame for the devel- 
opment of dispositions? 


Perhaps the biggest question about the 
teaching of thinking concerns how to inte- 
grate it with other practices, in school and 
out of school, in an effective way. We already 
know enough about the teaching of think- 
ing to have a substantial impact, and yet the 
reality of collective practice falls short. We 
must ask ourselves: How can thinking ini- 
tiatives be sustained and integrated with the 
many other agendas faced by schools, mu- 
seums, clubs, corporate cultures, and other 
settings in which thinking might thrive? 
Only when we understand how to foster 
cultures of thinking not just within in- 
dividual families or classrooms but across 
entire schools, communities, and, indeed, so- 
cieties, will scholarly insights and the prac- 
tical craft of teaching thinking achieve their 
mutual promise. 
constraint-satisfaction network, 134
LISA vs., 134-135 
multiconstraint theory in, 134 
predicate matching in, 134 


ACT-R (production system), 137, 404, 408, 425 


analogy in, 409 

attentional activation in, 412 
BST in, 407 

chunk strengthening in, 416 


cognitive modeling architecture in, 404, 406 


differential activation in, 413 
in implicit cognition, 433 
knowledge structures in, 405 
language learning and, 422 
latency responses in, 419 
MODS trial in, 413 
parallelism in, 405 

past tense generation in, 423 


production-rule learning in, 405 
RULEX model in, 416 
source activation in, 414 
subsymbolism in, 406, 408, 422 
WM and, 412, 413-414, 416, 470 
adaptive regression, 353 
Adaptive Strategy Choice Model. See ASCM 
additive extension effect, 280, 282 
factorial designs and, 
WTP and, 283 
affective thinking, 311 
affective valence, 271 
Affirmation of the Consequent (deductive reasoning), 
172 
AG (artificial grammar) 
in implicit cognition, 434
in implicit learning, 441 
standard learning tasks for, 434
Agent-Based Modeling and Behavior Representation. 
See AMBR 
aging (cognitive) 
componential models for, 598-600 
comprehension and, 593 
correlated factor models for, 600-602 
correlation analysis and, 597-604 
factor loading variables and, 603 
Figure Classification test and, 599 
geometric analogies and, 594, 596 
hierarchical structure models and, 602-604 
LISA and, 87 
Location test and, 599 
matrix reasoning and, 594, 602 
mediational models for, 597-598
process-oriented research on, 593-597 
reasoning and, 567-568, 590, 591-593 
relative distribution of inspection and, 594 
response speed and, 593-594 
serial position functioning and, 596 
series completion tasks and, 594 
strategy use and, 594-595 
study times, relative to, 594 
WCST and, 594, 595, 670 
WM and, 595-597 
algorithms 
in analysis, 5 
exhaustive search, 325 
in MDS models, 16 
in problem solving, 325 
alignment-based models (for similarity), 24-26 
alignable differences in, 25 
formal, 24-25 
matching features within, 24 
non-alignable differences in, 25 
object sets in, 24 
rigid transformations within, 26 
SIAM, 2 
verbal structural similarity in, 26 
AMBR (Agent-Based Modeling and Behavior 
Representation), 425 
American Bar Association, 694 
anagrams, 343 
pop-out solutions and, 343 
Analog Retrieval by Constraint Satisfaction model. See 
ARCS 
Analogical Mapping by Constraint Satisfaction. See 
ACME 
analogs 
in visuospatial reasoning, 225 
analogy, 5. See also similarity 
abstract, 444
ACME model for, 134
ACT-R production systems and, 137, 409 
“alignable differences” in, 128 
case-based reasoning and, 121 
categorization in, 122 
causal references in, 122 
in cognitive development, 541
coherence as part of, 125 
component transfer processes for, 117, 122 
computational models of, 131-136 
convergence schemas in, 130 
Copycat model for, 131 
CWSG as part of, 128 
false recognition in, 129 
4-term, 118, 119 
geometric, 594, 596 
IAM, 131 
inference generalization as part of, 128-131 
intelligence and, 758 
knowledge representation in, 121-122 
“known” domains in, 409 
in legal reasoning, 686, 687-688 
mapping in, 117, 124-127, 410, 424, 531
in mental model theory, 187 
metaphor and, 120-121 
motion cue diagrams for, 130 
multiple comparisons in, 130 
non-human primates and, 611-613 
People Pieces task, 465 
political uses of, 125-127 
problem schemas in, 130 
problem solving and, 122, 136-137 
processing goal impact within, 125 
in production systems, 402, 409-412 
psychometric tradition in, 118-120 
relational generalization as part of, 130-131 
“relational shift” in, 124 
relations in, 121-122, 124 
retrieval impact as part of, 123-124 
roles in, 410 
science and, role in, 117-118 
in scientific reasoning, 713-714 
SME model and, 132-134 
solar systems model for, 409 
source analogs as part of, 117, 122, 714 
STAR model for, 132 
“story memory” as part of, 122 
structural parallelism in, 122 
symbol-argument consistency in, 531 
target analogs as part of, 117, 123-124, 714 
task-interference paradigm and, 465 
transfer paradigms for, 122-123 
WM and, 120 
WM in, 127-128 
analysis, 5. See also thinking 
computation in, 5 
implementation as part of, 5
representation/algorithm as part of, 5 
anaphoric reference, 562-564, 636 
animations 
diagrams and, 230 
limitations of, 230 
anomic aphasia, 499 
ANOVA model 
for causal learning, 147
ARCS (Analog Retrieval by Constraint Satisfaction) 
model, 134
ACME and, 134 
arithmetic 
look-up tables and, 576 
mathematical cognition and, 560 
number fact retrieval in, 574-575 
reaction time predictors in, 574 
artificial intelligence systems 
medical reasoning and, 728 
mental model theory and, 187 
ASCM (Adaptive Strategy Choice Model), 538 
associative learning systems 
causal learning and, 147 
ceiling effects and, 152-153 
cue-outcome contingency as part of, 147 
dual process theory and, 180 


SUBJECT INDEX 




naive theory (inductive reasoning), 109


temporal contiguity as part of, 147 
asymmetry 
causal, 112 
in conflict decisions, 250 
hemispheric, 485-487 
predictions (causal) and, 151 
in similarity-based induction, 103 
asynchronous parallelism, 426 
atomic chemical reaction theory, 373 
atomistic concepts, 52-53 
attribute framing, 249 
attributes 
affective valence and, 271 
compatibility between, 252-253 
in directional outcome-motivation, 297-298 
evaluation of, 252-254
extensional, 282 
framing of, 249 
natural assessments for, 271 
non-directional outcome-motivation and, 301 
spreading activation and, 249 
substitution of, 269-273, 274, 287 
target, 269 
auditory hallucinations, 505 


backpropagation models 
in dynamic systems models, 536
in neural net models, 533
BACON program, 357
balance scale model, 532
cognitive development and, 548-549
diagram, 532, 533 
structure of, 534 
Bayesian models 
of categorical inference, 110 
extensional reasoning and, 197 
for inductive reasoning, 110-111 
in WTP surveys, 283 
belief bias, 175-177, 179 
belief systems, 3. See also intuitive theory 
belief-laden material 
cerebellum effects of, 485 
hemispheric asymmetry and, 485 
belief-logic conflicts 
cerebellum effects of, 487 
cognitive neuroscience and, 487-488 
inhibitory trials for, 487 
belief-neutral material 
cerebellum effects of, 485 
The Bell Curve (Herrnstein/Murray), 766 
bias 
belief, 175-177, 179 
content/context effects of, 178-179
in deductive reasoning, 174-175, 176 
external validity problem and, 174-175 
heuristics and, 268, 270 
interpretation problem and, 175 
in language, 646 
matching, 174-176 
in MDS models, 19-20 


negative conclusion, 174-175 
normative system problem and, 173-175 
path, 646 
in similarity-based induction, 107 
weighting, 270 
The Big Book of Concepts (Murphy), 37 
binding 
argument, 76 
in conjunctive connectionist representations, 54 
dynamic, 84 
multiplicative schemes, 82 
relational representation, 82 
role-filler, 82, 83-84, 87 
“black box” theories, 478 
blocking, 148 
backward, 149, 155 
bootstrapping 
knowledge structure repairs in, 389 
in language, 643, 649, 653 
nonmonotonicity and, 389-390 
Boyle’s law 
in power PC theory, 153 
Bradley-Caldwell (intelligence) studies, 765 
brain regions 
domain specificity and, 60 
BST (Building Sticks Task), 407 
in ACT-R, 407 
guess-overshoot production rules in, 407 
guess-undershoot production rules in, 407 
hillclimb-overshoot in, 408 
hillclimb-undershoot in, 408 
learned utility values average in, 408
overshoot in, 407 
undershoot in, 407 


Building Sticks Task. See BST 


Cambridge Handbook on Thinking and Reasoning, 688 
CAPS 

constraints for, 413 

WM in, 412 
cascade correlation models, 535, 540, 548 
Cascade model (skill acquisition), 419 
CASE (Cognitive Acceleration through Science 

Education) program, 783 

case-based reasoning 

in analogy, 121 
categorization (conceptual), 38, 415-417 

analogy and, 122 

basic level, 40 

causal learning and, 161 

decision bound, 46 

EBRW model, 416 

exemplar views of, 45 

family-resemblance, 57 

feature learning within, 47-48 

goal planning and, 38 

identity-lending, 57 

in inductive reasoning, 100 

inference and, 44-48 

inference learning in, 47 
inside view of, 102
learning and, 38 
in legal reasoning, 698-699 
multiple function sensitivity in, 62 
multiple strategy approach for, 424 
multiple systems of, 46-47 
neural network, 46 
new models of, 45-46 
outside view of, 102 
predicates’ role in, 100 
prototype views within, 44 
rational, 46 
reasoning as function of, 48 
RULEX model, 416 
sensitivity within, 45 
structure within, 47 
transfer tests in, 47 
typicality effects within, 44 

causal learning 
abstraction levels for, 161 
acyclic constraints in, 153 
ANOVA model for, 147 
applications of, 145, 146 
associative learning theory and, 147 
Bayesian integration accounts as part of, 145
bi-directional associations and, 151 
blocking in, 148, 149, 155 
category formation and, 161 
ceiling effects in, 152-153 
coherence of, 159 
computational-level induction theory for, 153-155 
contiguous event pairings in, 162 
contingency tables for, 147 
correlations as part of, 151 
covariation vs., 144, 154
cue-outcome contingency and, 147 
cues as part of, 147 
diagnostic conditions as part of, 151 
directions of, 151-152 
empirical knowledge and, 145 
enabling conditions in, 160-161 
flexibility of, 159 
historical background of, 145 
human contingency judgment studies for, 156-158 
inference and, 217 
intervention in, 144, 160 
iterative retrospective revaluation in, 159-160 
knowledge mediation hypothesis as part of, 163 
“launching effects” in, 145, 162 
learning process and, 144 
mechanism view of, 146 
“no confounding” principle in, 154 
noncausal observations and, 144
normative deviations in, 158-159 
outcomes as part of, 147 
perception and, 145, 162 
power PC theory of, 153, 154
predictions as part of, 144, 151, 152 
predictive learning vs., 147 
probabilistic contrast in, 147 


release-from-overshadowing condition in, 155-156 
situational statistical models for, 147-148 
support models for, 158 
temporal contiguity and, 147 
time and, 161-164 
traditional rule-based models for, 153
unbound variables in, 153 
causal thinking 
in medical reasoning, 736-738 
in scientific reasoning, 710, 711 
“cause and effect.” See also causal learning 
asymmetry in, 112 
inductive reasoning, role in, 95, 98, 99, 
112-113 
relevance in, 112 
CCC (Cognitive Complexity and Control) theory, 
538-539 
relational complexity metric and, 539 
ceiling effects 
associative learning models and, 152-153 
in causal learning, 152-153 
covariation vs. causation in, 152 
inhibitory cues and, 152 
power PC theory and, 155 
central executive 
in Embedded Processes model, 462 
functions of, 461 
in multiple working memory models, 458 
Supervisory Attentional System in, 460-461 
central executive (memory model), 458, 460 
certification theory, 355 
childhood development 
relational thinking during, 73 
choice. See also decision making 
BST and, 407 
in production systems, 406-409 
chunk decomposition, 344
chunking 
conceptional, 539 
in Q-SOAR model, 532 
in Soar, 419 
Church-Turing hypothesis, 476, 489-490 
circular convolutions, 81, 83 
city-block metrics 
in MDS models, 16 
class inclusion (cognitive development), 546 
classical conditioning, 505 
classification (cognitive development), 545-547 
prototype models of, 545
clinical psychology, 4 
“code-switching,” 653 
cognition 
creativity and, 356-358 
mathematical, 559 
medical, 727 
motivated thinking vs., 296 
non-directional outcome-motivation and, 312 
similarity in, 14 
Cognitive Acceleration through Science Education 
program. See CASE 
“cognitive collage”
cognitive maps vs., 22
Cognitive Complexity and Control theory. See CCC theory
cognitive deficit integration
thought disorder and, 509-510
Cognitive Development (Bryant), 530
cognitive development (children), 529-530. See also thinking
analogy in, 541
“apprenticeship” in, 790, 792
ASCM and, 538
balance scale in, 548-549
brain development links for, 537
CASE program for, 783
category induction in, 546
causal reasoning in, 548
CCC theory in, 538-539
class inclusion in, 546
classification in, 545-547
Cognitive Revolution and, 530
complexity in, 538-539
concept of Earth in, 549
concept of mind in, 547
concrete operational stage as part of, 530
conservation in, 544
criticism of, 530
domain knowledge and, 540
domain specificity in, 541
dynamic systems models in, 549
equilibration in, 530
evidential diversity principle and, 548
expertise and, 540-541
formal operational stage as part of, 530
“function logic” in, 530
groupings as part of, 530
increased dimensionality in, 539-540
information integration theory for, 548
Knowledge Forum for, 783
language and, role in, 531
preoperational stage of, 530
reasoning processes in, 541-543
relational complexity metric in, 539, 548
scientific thinking about, 547-548
sensorimotor stage of, 529
strategy development for, 537-538
symbolic categories in, 545-547
time/distance/velocity and, 548
transition mechanisms in, 549
transitivity in, 544-545
Wason selection task and, 543
zone of proximal development in, 531
cognitive maps
alignment and, 223-224
“cognitive collage” vs., 224
in distortions program, 222
hierarchical organization in, 222-223
information levels in, 223
landmarks in, 223
rotation and, 22
Cognitive Models of Science (Giere), 708
cognitive neuroscience, 5
belief-logic conflicts and, 487-488
brain localization functions, role in, 477
for deductive reasoning, 476-477
dissociation and, 477-478
fundamental processes of, 6-7
future applications of, 489
hemispheric asymmetry and, 485-487
mental logic theory and, 479
mental model theory and, 478-479
real-time computational models for, 477
Cognitive Research Trust program. See CoRT
Cognitive Revolution, 211, 295
cognitive development and, 530
non-human primates, role in, 593
Cohen & Braver’s model (thought disorder), 500-503
cognitive control mechanism as part of, 508
diagram, 500
lexical disambiguation task in, 501
“one-back” continuous performance task in, 501
phasic dopamine activity in, 502
response levels in, 502-503
Stroop task in, 501
coherence
consistency and, 382
collectivism
cultural thought and, 665, 669, 672
gender differences within, 669
individualism vs., 675-676
object contextualization in, 674-675
communication
within concepts, 38
diagrams for, 227-230
inter-modular and (language), 651
Community of Learners program, 794
compatibility, 252-253
in attribute evaluation, 252-253
principle of, 253
complex knowledge systems, 371
declarative, 371
procedural, 371-372
componential theory (intelligence), 757-758
compromise effect, 250
computational models
ACME, 134
for analogy, 131-136
for causal learning (induction), 153-155
cause in, 137
CWSG, 128
data mining in, 719
limitations of, 137
LISA, 134-136
in psychometric tradition (analogy), 119
in scientific methodology, 109-111
for scientific reasoning, 719
SME, 132-134
stimulus generalization (causal learning), 156
language programming systems, 74
simulations, 4 
Computer Supported Intentional Learning 
Environments program. See CSILE 
concept of mind, 547 
appearance-reality testing for, 547 
executive function relation to, 547 
false-belief tasks for, 547 
social-perceptual knowledge and, 547
“theory of mind” vs., 547 
concepts 
broad applications of, 62 
categorization within, 38, 40 
combination theories, 39, 40, 49-52, 60 
communication within, 38 
comprehension of, 38 
direct observation within, 50 
domain specificity within, 58-62 
functions of, 37, 38-39, 44-48 
hypothesis testing, 40 
indirect observation within, 50 
inferential/atomistic, 52-53 
language functions within, 48-54
memory and, 39, 40-41, 48-49 
mental pathways between, 39 
as mental representation, 37 
nature of, 6 
“No Peeking Principle” as part of, 50 
nonproprietary, 61 
prediction for, 38 
preexisting structures for, 715 
proprietary, 61 
psychodiagnostic classification within, 54
psycholinguistic research and, 62-63 
psychological essentialism, 56-57 
psychometaphysics and, 55 
radical change for, 715 
“semantic memory marriage” as part of, 39 
sortalism, 57-58 
theories of, 54-56 
conceptual knowledge 
perceptual vs., 107 
in similarity-based induction, 107 


Conditional Proof (suppositional reasoning), 171 
conditional reasoning. See deductive reasoning 


conditioned inhibition, 148 
The Conditions of Learning (Gagne), 392 
“Confirmation Bias” 
in scientific reasoning, 709 
variants in, 709 
WM limits and, 710 
conflict, 249-251 
asymmetric dominance and, 250 
compromise effect as result of, 250 
default alternatives and, 250 
evasion, 358 
regularity condition and, 249 
resolution, 402 
sets, 402 
status quo in, 249 


abeyance in, 388 
bolstering in, 388 
in declarative knowledge, 388 
recalibration in, 388 
conflict sets, 402 
conjunctive connectionist representations, 
81-83 
binding storage and, 84 
circular convolutions in, 81, 83 
HRRs in, 81, 83
implicit relations continuum as part of, 83 
LISA and, 84 
LTM storage, 54 
SAA isomorphism and, 82 
sparse coding in, 83 
spatter codes in, 81, 83 
“superposition catastrophe” within, 83 
tensor products in, 81 
connectionist representations 
conjunctive, 81-83 
diagram of, 80 
distributed activation vectors in, 78 
eliminative, 79 
flexibility of, 78-79 
identity functions for, 80 
latent semantic analysis in, 78 
patterns in, 75 
relational capacity of, 81 
SAA vs., 78 
specific function inputs for, 80 
Story Gestalt model and, 79 
summary of, 87 
as symbolic relational representation, 78-81 
conservation 
acquisition in, 544 
in cognitive development, 544 
Q-SOAR model and, 544
consistency (declarative knowledge), 380-382 
coherence and, 382 
vs. veridicality, 381 
constraint relaxation, 344 
Contention Scheduler 
in Supervisory Attentional System, 460 
context 
bias and, effects on, 178-179 
in deductive reasoning, 174-176, 179 
perceptual form, 331-332 
in problem solving, role of, 331-332 
in visuospatial reasoning, 225 
contingent valuation method. See CVM 
The Contrast Model 
asymmetric similarity prediction by, 20 
featural analyses in, 20-21 
“Hamming distances” in, 21 
MDS and, 20 
neural network representations in, 21 


object sets in, 20, 23 


copy with substitution and generation. See CWSG 


“correction model” 
dual process theory vs., 268 
778, 779, 781
counterfactual thinking
additives in, 308
in directional outcome-motivation, 298
in regulatory focus theory, 308
in strategy-motivated thinking, 307-308
subtractives in, 308
covariation, 144, 146
causal learning vs., 144, 154
in ceiling effects, 152
studies on, 155-156
creative thinking. See creativity
creativity, 351
adaptive regression in, 353
BACON program and, 357
blind variations in, 359
candle problem and, 356
cognitive approaches to, 356-358
computer simulation approaches to, 357
confluence approaches to, 359-361
contribution scaling for, 363-364
divergent thinking tasks for, 354
elaboration in, 353
environmental role in, 351
evolutionary approaches to, 359
evolving-systems model for, 360
explicit theories for, 360
Geneplore model for, 356
implicit theories for, 360
intelligence and, 354-355
intrinsic motivation for, 358
investment theory of, 361
lateral thinking and, 352
Lorge-Thorndike Verbal Intelligence scores and, 355
mystical approaches to, 352
9-dot problem and, 356
pragmatic approaches to, 352-353
primary, 361
propulsion theory of, 363
psychodynamic approach to, 353-354
psychometric approaches to, 354-356
RAT for, 355
SAT scoring and, 355
secondary, 362
selective retention in, 359
self-actualization and, 358
social environment and, 358
social-personality approaches to, 358-359
synectics in, 353
“thinking hats” in, 353
Torrance Tests of Creative Thinking and, 354
Unusual Uses Test in, 354
WISC and, 355
Critical Legal Studies movement, 694, 695-696
judges and, 695
legal realism theory and, 695
cross-dimensional mapping (for heuristics), 272
cross-modality matching and, 272
univariate matching and, 272
Environments) program, 753, 787, 790, 793
cue-outcome contingency
in associative learning theory, 147
blocking in, 148
in causal learning, 147
conditioned inhibition in, 145
cue validity in, 147, 148
overshadowing in, 148
cultural thought. See also culture
behavioral interpretations and, 666
collectivism and, 665, 669, 672
demographic factors for, 669
difference patterns in, 673
in dispositional thinking, 787-788
ethnotheories of intelligence and, 668
“Essentialist” model of, 674
ethnography and, 665-666, 667
false-belief tasks and, 673
importance of, 792
logic and, 676
narrative vs. logical-scientific in, 675
négritude and, 666
practice as part of, 793
problems in, 667
“Relationship” model for, 674
religion as factor in, 670
resources in, 793
sibling caregiving and, 672
social ideology levels and, 666-668
theory of mind and, 670-674
for thinking, 796
“thinking routines” and, 793
visual pattern construction in, 676-677
“Culturally Situated Cognition” (Ceci/Kopko/Wang/Williams), 677
culture
positive self-evaluation and, 312
reasoning as effect of, 626
Culture and Thought (Cole/Scribner), 667
CVM (contingent valuation method)
scope neglect and, 282
CWSG (copy with substitution and generation)
analog inference and, 128
computational models’ use of, 128
constraints on, 129
SME and, 132
variable binding/mapping in, 128-129


DAPAR test (thinking), 778
decision making
attribute evaluation in, 252-254
bounded rationality in, 244
conflicts in, 249-251
decision utility and, 259
default alternatives in, 249, 250
description invariance in, 246
emotions, role in, 258
experienced utility and, 259
frame of mind for, 257-258
framing effects for, 246
insurance and, 245, 246-247
local/global perspectives in, 254-258
loss aversion, 247-249 

mental accounting in, 255-25 
normative intuitions in, 254 
perceived certainty in, 246 
preference inconsistencies in, 251 
prospect theory of, 244-246, 260 
rational theory of choice and, 243, 244 
reason-based, 251-252 

reasoning and, 2 

repetition in, 254-255 

risk framing in, 246-247 

riskless choice in, 247-249 
segregated opportunity in, 255 
semantic framing in, 249 


separate vs. comparative evaluation in, 253-254 


status quo bias in, 248-249 

subjective utility in, 243 

Sure Thing Principle in, 251 

temporal discounting in, 256-257 

under uncertainty, 244-247 
decision-analysis (medical reasoning), 728 

quantitative model of inference in, 728 
declarative knowledge, 371 

abstraction levels in, 384 

accretion of, 376 

assembly process within, 383-384 

assimilated information as part of, 377 

assimilation distortion in, 388 

atomic chemical reaction theory in, 373 

center-periphery structures within, 373-374 

complexity within, 383-384 

computational power in, 388-389 

conceptual combinations within, 384 

conflict evasion in, 388 

connectedness within, 377-380 

consistency in, 380-382

domain grouping in, 373 

dynamic equilibrium within, 386 

egocentrism in, 385 

evolution theory and, 371, 373 

exocentrism in, 355 

explanation patterns as part of, 375 

finer grain of representation in, 382-383 

intuitive theories within, 374 

knowledge base size in, 374, 376-377 

learning paradox in, 389 

monotonicity in, 356 

nonmonotonicity in, 389-392

organization in, 373-376 

perspective change in, 354 

plans in, 37 

schemas in, 374-375 

scripts as part of, 375 

semantic networks within, 373 

theory representation within, 373-374 

vantage point shifting in, 384-385 
deductive reasoning, 2 

Affirmation of the Consequent as part of, 172 


cognitive neuroscience and, 476-477 
content effects on, 187 
context effects on, 174-176, 179 
Darwinian algorithms, 173 
defeasible inferences in, 178 
Deontic Selection Task for, 174-176, 178 
dual mechanism theories and, 476 
dual process theory and, 179-180, 181 
errors in, 713 
heuristic-analytic theory and, 173-175
inference rules systems for, 171-172, 475 
language and, 9 
legal reasoning by, 686-687 
mental logic theories and, 475-476 
mental model theory and, 172, 181, 190-191 
‘natural logics’ as part of, 171, 181 
origins of, 169 
pragmatic reasoning schemas in, 173 
principle of truth in, 172 
psychological theories of, 475-476 
in scientific reasoning, 712-713 
syllogisms and, 169, 713 
Deontic Selection Task, 174-176, 178, 180 
Darwinian algorithms in, 177 
pragmatic reasoning schemas in, 177 
Wason selection task and, 174-176, 180 
“derailment,” 496
verbal, 509 
developmental psychology 
central conceptual structures within, 784 
evolution of, 783-784 
sortalism’s role in, 57 
diagnosticity effects 
in similarity, 29 
diagram 
motion cue, 130 
diagrams 
animations and, 230 
for communication, 227-230 
for connectionist representations, 80 
Duncker radiation problem and, 230 
enrichment of, 229-230 
expertise and, 229 
extra-pictorial devices for, 230 
graph comprehension models and, 22 
iconic, 157 
inferences from, 227-229 
for insight, 230-231 
in mental model theory, 186 
motion cue, 130 
multiple sense of, 230 
9-dot problem, 230
“reading off,” 227 
in visuospatial reasoning tasks, 219, 22 
direct conjunction fallacy, 276 
directional outcome-motivation, 296, 297 
accessibility in, 299 
attribution in, effects on, 297-298 
circumstance distinction within, 305 
closure motivation in, influences on, 305 
cognitive-resource constraints on, 304
concept organization, effects on, 300 
counterfactual thinking in, 298 
evidence evaluation for, effects on, 298 
extended information processing in, 304-305 
influences on, 297 
information search in, effects on, 299 
knowledge activation in, 299 
memory reconstruction in, 299 
positive self-evaluation within, 298, 300 
reality constraints on, 303-304 
recall and, 299-300 
dispositional bias, 699-700 
dispositional thinking, 777, 785, 795 
behavioral effects of, 786 
cognitive closure in, 786 
cultivated, 787-788 
empiricism of, 786-787 
entity learners in, 786 
habits of mind in, 785 
incremental learners in, 786 
logic and, 785-786 
dissociable neural networks, 481-482 
dissociation (brain functions), 478 
cognitive neuroscience and, 477-478 
4-card Selection Task and, 485 
language and, 478 
patient data evidence for, 484-485 
transitive reasoning and, 484 
distortions program 
cognitive maps in, 222 
representations and, 221-222 
symmetry in, 222 
in visuospatial reasoning, 221 
diversity 
age-based sensitivity for, 103 
in scientific induction methodology, 108 
in similarity-based induction, 103 
diverted attention, 436 
dual-task paradigms in, 436 
DMTS tasks, 612 
domain specificity 
in analogy, 409 
biology (naive), 59 
brain regions and, 60 
child development and, 541
classification and, 541
in cognitive development, 541 
within conceptual development, 58-62 
folkbiology and, 59 
inheritance theory in, 60 
interdomain differences and, 59-60 
mechanics, 387 
memory and, 60-61 
multiple storage theories within, 61 
nonproprietary concepts and, 61 
“novel,” 409 
physics (naive), 59 
in production systems, 409 
proprietary concepts and, 61 


semantic-based memory structures and, 60 
transitive inference within, 541
domain-general information sources, 14 
domains 
before-after structures in, 373 
cause-effect in, 373 
in declarative knowledge, 373 
hierarchies in, 373 
local structuring in, 373 
naive biology, 59, 61 
naive law-school, 61 
naive physics, 59, 61 
dopamine 
functions, 505 
phasic activity, 502 
receptor binding, 512 
dot numerosities, 577 
subitizing for, 577
dot-arrays, 572 
“double disjunction” 
in mental model theory, 193 
dual mechanism theories 
deductive reasoning and, 476, 479 
dual process theory 
associative learning and, 180 
central working memory resources as part of, 180 
“correction model” vs., 268 
in deductive reasoning, 179-180, 181 
factors for, 268 
heuristics and, 267-268 
implicit learning and, 179 
intuitive judgment and, 179 
System 1 in, 180, 267 
System 2 in, 180, 267
duration neglect 
as attentional phenomenon, 285 
End Affect in, 284 
in experience evaluation, 283-285 
Peak Affect averages and, 284 
Peak/End Rule in, 284 
in prototype heuristics, 282 
substantial variations effects of, 284 
dynamic systems models (cognitive development), 
536-537, 549 
backpropagation models in, 536 


EBRW (exemplar-based random walk) model, 416 
in WM, 416 
economics, 700 
legal reasoning and, 700 
elaboration (creativity), 353 
electronic medical records. See EMR 
elementary transformations, 215-216 
candidate catalog in, 215-216 
eliminative connectionist representations, 79 
inferences within, 79 
Embedded-Processes memory model, 462-463 
active memory in, 462 
central executive as part of, 462 
diagram for, 462 




“focus of attention” in, 462 
modality specificity in, 463 
emotional intelligence, 752 
emotions 
affect heuristics and, 258 
anticipatory, 255 
decision making and, 258 
empathy gaps and, 258 
frame of mind, role in, 258 
inconsistency as result of, 258 
reasoning and, 312 
empathy gaps, 258 
EMR (electronic medical records), 744
concept-based, 744
displays for, 744
source-based, 744
time-based, 744
endowment effects, 247 
entity learners, 786 
entrenchment theory 
inductive reasoning, 97 
kind in, 98 
limitations of, 97-98 
similarity in, 98 
enumeration 
inductive reasoning by, 97 
EPIC (production system), 404-405 
parallel production-rule firing in, 405 
production rules within, 412 
Soar combination with, 425 
WM in, 413 
episodic buffers 
functions of, 461-462 
in multiple working memory models, 458 
equiprobability. See principle of equiprobability 
ERPs (event-related potentials), 342 
in scientific reasoning, 716 
essentialism. See psychological essentialism 
Euclidean metrics 
in MDS models, 16 
Euler circles 
medical technology and, 743 
in mental model theory, 190, 191, 192 
event-related potentials. See ERPs 
“Everyday Cognition” (Carraher/Ceci/Schliemann), 
677 
evidential diversity principle, 548 
evolution theory, 371, 373 
replacement systems and, 391 
evolving-systems model, 360 
exemplar-based random walk model. See EBRW 
experiment spaces 
hypothesis spaces vs., 709 
in scientific reasoning, 708 
expertise 
cognitive development and, 540-541
diagrams and, 229 
inference, effects on, 231 
problem representations and, 336 
in problem solving, 336-338
similarity and, 29
in transfer of learning processes, 790-791 
Explaining Science (Klahr), 708 
explicit induction, 199, 200-203 
“conjunction” fallacy in, 202 
explanations abduction as part, 201 
Kepler's third law, 200
extension effects 
in similarity, 29 
extensional reasoning, 197, 198 
Bayesian reasoning and, 197 
external validity problem (biases), 174-175 


false-belief tasks 
cultural thought and, 673 
feature exclusion 
in similarity-based induction, 104 
feature learning, 47-48 
similarity and, 48 
“feeling of knowing,” 445
fetal hypoxia 
schizophrenia and, 514 
thought disorder and, role in, 511 
Figure Classification test, 599 
finer grain of representation 
in declarative knowledge, 382-383 
emergent systems in, 383 
FINST system, 581 
pointers, 582 
Fluxions (Newton), 715 
fMRI (functional magnetic resonance imaging), 
468 
conceptual change and, 718 
in scientific reasoning, 716 
“focus of attention” 
in Embedded-Processes memory model, 462 
primary memory vs., 462 
folkbiology, 59. See also domain specificity 
taxonomies within, 59 
foresight, 4 
formalism theory (legal reasoning) 
criticisms of, 689 
“first principles” in, 688 
Law and Economics movement and, 694 
in legal reasoning, 688-690 
4-CAPS (production system), 404, 405 
parallel firing in, 405 
4-card Selection Task, 277, 485 
content in reasoning and, 485 
dissociation and, 485 
permission schemas in, 485 
4-term analogies, 118, 119 
fractions, 560, 561 
frame of mind 
decision making, role in, 257-258 
emotions and, 258 
identities and, 257-258 
mood maintenance and, 258 
priming in, 257 
framing effects, 246 
frequency formats
conjunction fallacy in, 279 
in System 2 (dual process theory), 279 
fully explicit models 
in principle of iconicity, 188 
for sentential reasoning, 188, 189, 190 
truth tables vs., 190 
functional anatomy (reasoning), 479-481 
basic paradigms for, 479-481 
cerebellum effects, 481 
content/no content effects, 482 
dissociable neural networks and, 481-482 
semantic content and, 482 
stimuli presentations of, 479 
study findings for, 481-484 
functional fixedness, 332 
fuzzy set theory, 43 


GE (General Enrichment) programs, 778 
General Enrichment programs. See GE 
General Problem Solver. See GPS 
generative representation systems 
in transformational models, 26 
generic noun phrases, 53 
Genoplore model 
for creativity, 356 
exploratory phase of, 356-357 
generative phase of, 356 
geometric models (of similarity), 15-17 
problems with, 19 
symmetry assumptions within, 18 
Gestalt psychology, 3 
linguistic connections within, 9 
principles of perceptual organization in, 210 
problem solving, origins in, 324 
scientific reasoning and, role in, 706 
Story model and, 79 
gesture 
inference and, 218 
spontaneous, 215 
GPS (General Problem Solver), 324, 329 
weaknesses of, 324 
graphics 
elements, 225-22
language vs., 232
relations, 226
visuospatial reasoning context for, 225
groupthink, 776 
GUIDON consultation systems, 729 


“Hamming distances” 
in The Contrast Model, 21 
Handbook of Implicit Learning (Frensch/Stadler), 441
Head Start program, 765 
hemispheric asymmetry 
belief-laden material and, 485 
belief-neutral material and, 485 
cognitive neuroscience and, 485-487 
Hemsley’s & Gray model (thought disorder), 503-506 
auditory hallucinations in, 505 
classical conditioning and, 505
comparator mechanism as part of, 504
diagram, 503
dopamine functions in, 505 
excitatory input disruption in, 505 
interrupted motor labeling programs in, 505 
“match”/“mismatch” signals within, 505 
reticular nucleus inhibition within, 505 
thalamocortical disinhibition in, 505 
heuristics 
accessibility within, 270-272 
additive extension effect in, 280, 282 
affect, 58, 258 
anchoring, 272 
attribute substitution and, 269-274 
bias and, 268, 270 
“choosing by liking” in, 286 
cognitive, use of, 111-112 
coherence rationality and, 277 
cross-dimensional mapping for, 272 
in deductive reasoning, analysis of, 173-175 
direct conjunction fallacy in, 276 
dominance violations in, 285-287 
elicitation, 274 
factorial designs in, 280-281, 287 
hill-climbing, 327-328 
identification of, 274-276 
in inductive reasoning, 111, 199 
judgments, role in, 267 
means-ends analysis, 328-329
medical mistakes, role in, 739 
in medical reasoning, 730 
in problem solving, 326 
prototype, 281-287 
representativeness in, 112, 199, 272, 
274-281 
Hierarchical Cluster Analysis 
and MDS, 22 
hierarchical structure models, 602, 603 
aging’s effect on, 602-604 
first-order factors for, 602, 604 
for intelligence, 754-755 
second-order factors for, 603 
hierarchies 
clustering in, 22 
in domains, 373 
Hierarchical Cluster Analysis, 22 
in MDS models, 22 
representational, 22 
tangled, 373 
in thinking (kinds), 2 
hill climbing 
means-ends analysis vs., 329
in problem solving, 327-328 
Hobbits and Orcs problem 
problem solving and, 327 
solution path for, 328 
holographic reduced representations. See HRRs 
How to Solve It (Polya), 324 
HRRs (holographic reduced representations), 81, 83 
Human Development, 540 
Human Problem Solving (Newell/Simon), 324, 401
hypothesis spaces
experiment spaces and, 709 
in scientific reasoning, 708 
hypothesis testing 
medical reasoning and, 731 
scientific reasoning and, 707, 709-710 


IAM (Incremental Analogy Model), 131 
ideational confusion definition, 496


IE (Instrumental Enrichment) programs, 778, 782, 787, 788
illusory inferences, 194, 195 
imagery 
perception vs., 211-216, 218 
“Impetus” theory, 715 
intuitive theory vs., 387 
Newtonian scientific theories and, 715 
implicit cognition 
abstraction in, 442-445 
ACT-R models and, 433 
AG study for, 434 
age-based abilities in, 431 
amnesia and, 431 
definition of, 432 
“feeling of knowing” in, 445 
incubation in, 445 
information accessibility in, 433 
insight in, 445 
intuition as part of, 445 
knowledge availability in, 433 
learning in, 440-442 
memory, 432, 438-440 
methodology for, 433-436
as non-rule based, 433 
opinions about, 431-432 
perceptual representations in, 443-445 
problem solving in, 445-447 
spreading activation in, 445 
storage of, 432-433 
symbol manipulation in, 433 
transfer protocols in, 443 
implicit induction, 199-200 
principle of modulation in, 199 
truth functional in, 200 
implicit learning, 440-442 
acquisitional mechanisms in, 442 
in adults, 441-442 
AG procedures in, 441
artificial word segmentation in, 441
in infants, 440-441 
implicit memory, 432, 438-440 
under anesthesia, 436-437 
attentional load in, 436 
context sensitive, 439 
diverted attention in, 436 
fragment completion for, 439 
“polarity” fallacy in, 440 
subliminal perception and, 437-440 
implicit relations continuum, 83 
inclusion fallacy 
in similarity-based induction, 105
Incremental Analogy Model. See IAM
incremental learners, 786
incubation, 445 
indirect inference rules systems. See suppositional 
reasoning 
individualism 
collectivism vs., 675-676 
inductive reasoning, 2, 95, 198-199 
associative learning systems and, 96 
availability heuristic in, 199 
categorical, 712 
categorical structures and, 100, 111 
“cause and effect” and, 95, 98, 99, 112-113 
collocation as part of, 100 
computational-level causal theory for, 153-155 
conditional probability in, 101 
contrast models of similarity and, 99 
descriptive level of, 96 
entrenchment in, 97 
by enumeration, 97 
experience as part of, 95 
explicit, 199 
by generalization, 712 
generalization gradient as part of, 99 
heuristics as part of, 111 
implicit, 199 
justificatory level of, 96 
naturalistic accounts of, 98 
predicates’ role in, 99 
problems with, 95 
projectibility, 97 
reflective reasoning and, 96 
“riddle” of, 97, 98 
as scientific methodology, 102 
in scientific reasoning, 712 
similarity-based, 102-107, 111 
tendency as instinct in, 96 
inference 
as concept, 52-53 
from diagrams, 227-229 
expertise, effect on, 231 
gesture and, 218 
illusory, 194, 195 
learning, 47 
in mental environments (visuospatial), 218-220 
in mental model theory, 192, 194 
motion in space observation, 216-217 
object-based, 332-333, 340-342 
perception of causality and, 217 
in real environments (visuospatial), 217-218 
from similarity, 13 
from sketches, 231 
transitive, 613-614 
in visuospatial reasoning, 216-221 
inference rules systems (direct) 
for deductive reasoning, 171-172, 475 
Modus Ponens, 171-172 
information processing theories, 531-532
cascade correlation models and, 535
dynamic systems models, 536-537
Neo-Piagetian school and, 531 
neural net models, 532
Q-SOAR model, 532
symbolic architectures within, 531
information-processing (in medical reasoning), 728
protocol analysis in, 728
insight problems, 342-344
anagrams and, 343
9-dot problem as, 331
pairs in, 343 
Instrumental Enrichment programs. See IE 
insurance, 245 
in risk decision framing, 246-247 
integrative reasoning, 593 
aging and, 594, 596 
WM and, 596 
intelligence 
analogy and, 758 
bioecological model of, 764-765 
biological approaches to, 758-760 
Bradley-Caldwell studies on, 765 
brain size and, 759 
choice reaction times for, 755-756 
cognitive approaches to, 755-758 
componential theory for, 757-758 
contextual approaches to, 760-762 
contextual differences in, 752 
conventional conceptions of, 354 
creativity and, 354-355 
emotional, 752 
environmental influences on, 766 
ethnotheories of, 668 
evolution theory of, 760 
glucose metabolism and, 759 
heredity’s role in, 766 
hierarchical models of, 754-755 
home environment and, effects of, 765 
implicit theories of, 751-752 
improvements to, 764-765, 766 
inspection time indicators for, 755
integration effects on, 766 
mediated learning and, 782 
multiple intelligences theory, 762-763 
neural conduction speed and, 759 
PET and, 759-760 
physiological indicators for, 758 
primary mental abilities for, 754
psychometric approaches to, 753-755 
simultaneous processing speed and, 756-757 
social, 752 
SOI model for, 754
syllogisms’ role in, 757 
systems approaches to, 762-765 
testing, 752-753 
theory of “g” and, 754, 782 
Triarchic theory for, 763-764 
“true,” 764 
WM and, 757 
intelligence quotient. See IQ 
intelligence testing, 752-753 
culture-relevant, 761 
elements of, 753
history of, 752-753
major scales for, 753
mental context in, 762
physical context in, 762
social context in, 762
Stanford-Binet test, 753 
Wechsler scale, 753
intensional reasoning, 197 
“intermediate effect” 
idealized representation of, 735 
in medical reasoning, 735 
internal representations, 2 
interpretation problem (for biases), 174-175, 176 
intuition, 445 
Intuitive Math programs, 779, 780 
intuitive theory 
belief systems vs., 393
declarative knowledge and, 374 
formation of, 387 
impetus theory vs., 387 
nonmonotonicity and, 387 
IPAR studies, 354 
IQ (intelligence quotient) 
certification theory in, 355 
creativity and, 354 
IPAR studies and, 354 
neural-conduction speed and, 759 
reaction times and, 756 
Terman Concept Mastery Test for, 354 
threshold theory in, 355 
irrational numbers (mathematical cognition), 561 
extracting roots operation for, 562 
isomorphism 
hypothesis of, 186 
in SAA, 76, 82 
“second-order,” 211 
in semantic alignment, 341 
iterative retrospective revaluation 
in causal learning, 159-160 


Jasper Woodbury program, 790 
Journal of Educational Psychology, 751 
judgment. See also decision making
in heuristics, 267
influences on, 6
intuitive, 179
reasoning and, 2
similarity in, 29
thinking and, 1
visuospatial thinking and, 221-22


“kernels,” 132 
Keys to Thinking program, 787, 793 
key-tapping paradigms 
baseline conditions for, 574 
in numerical estimation (animals), 573, 574, 577
kind. See also similarity 
in entrenchment theory, 98 
knowledge encapsulation, 737 
in medical reasoning, 737 
Knowledge Forum, 783, 793 
knowledge systems
analytic vs. empirical, 145
complex, 371
conceptual, 107
key aspects of, 121
in metaphor, 121-122
perceptual, 107
similarity and, 15
Kolmogorov complexity theory
conditional, 27
within transformational models (of similarity), 27


labeled graphs, 74
language
ambiguity in, 636-637
anaphoric reference and, 636
bootstrapping in, 643, 649, 653
classificatory tasks for, 642-643
closed-class functional vocabulary and, 641
“code-switching” in, 653
in cognitive development, 531
components of, 49
compositionality within, 52
concept development and, 634, 635
conceptual functions of, 48-54
“core knowledge” and, 635
count-noun morphology in, 642
deductive reasoning within, 9
dissociation (brain functions) and, 478
evidentiality within, 648
generic noun phrases as part of, 53
Gestalt psychology and, 9
graphics vs., 232
inconsistency in, 637-638, 650
inter-modular communication and, 651
labeling in, 654
landmark information in, 647
learning (production system), 421-423
linguistic relativity and, 633, 635, 639
manner-biasing in, 646
in mental model theory, 187
motion expression in, 645-646
numerosities and, 649-651
objects’ role in, 642-644
ontology’s influence on, 643
orientation and, 651-653
path-biasing in, 646
perception and, influences on, 638
phonemes and, 638
phonetic reorganization in, 638
polysemy in, 53-54
quantifiers within, 650
regularities in, 640
similarity tests for, 655
sound categorization and, 655
spatial relationships in, 644-645, 646-648
spatial, spontaneous use of, 226
thinking and, 7
“thinking for speaking” and, 653, 654
time and, 648-649
visuospatial reasoning, effect on, 220-221
language learning (production system), 421-423
ACT-R and, 422
fact representation in, 422
past tense generation models for, 422
lateral thinking, 352
“launching effects,” 145
law. See legal reasoning
Law and Economics movement, 694
formalism theory and, 694
Law and Society movement, 694-695
LCM (lowest common multiple), 339
learning
conceptual categorization and, 38
feature, 47-48
implicit, 440-442
signal
Learning and Inference with Schemas and Analogies. See LISA
legal realism theory, 690-694
American Bar Association and, 694
case comparisons in, 692-693
constructive realists and, 693
Critical Legal Studies movement and, 695
critical realists and, 693
judges’ role in, 692
sociocultural forces’ effect on, 691-692
“Sociological Jurisprudence” and, 690
sociology and, 691
statutory interpretation in, 691
legal reasoning, 685
analogical method (case-based) in, 686, 687-688
categorical thinking in, 698-699
certainty as part of, 699
Critical Legal Studies movement in, 694, 695-696
decision-making in, 697-698
deductive method (rule-based) in, 686-687
dispositional bias in, 699-700
Economics and, 700
empirical testing for, 696
formalism theory of, 688-690
group size effects on, 697
Law and Economics movement in, 694
Law and Society movement in, 694-695
legal realism theory in, 690-694
“noble lie” in, 695
probabilistic data and, 699
refinements in, 700
as science, 688
scientific reasoning vs., 696-700
Leviathan (Hobbes), 3
lexical disambiguation task, 501
linguistic relativity, 633, 635, 639
study of, 640
LISA (Learning and Inference with Schemas and Analogies), 134-136
ACME vs., 134-135
aging’s effects on, 87
analogs in, 86, 135
code hierarchy within, 85, 86
front-temporal degeneration and, 87
LTM in, 86
mapping connections in, 135
propositional representation in, 85, 86, 134 
as representational format, 84 
role-filler bindings in, 86 
sub-proposition units in, 85 
unit distribution hierarchy, 135 
WM and, 86, 135, 413, 469, 470 
Location test, 599 
logic problems, 3 
logics systems 
cultural thought and, 676 
mental, 171 
natural, 171 
predicate calculus, 186-187 
long term memory. See LTM 
Lorge-Thorndike Verbal Intelligence test, 355 
loss aversion, 245, 247-249 
endowment effects from, 247 
stability as result of, 248 
trade reluctance from, 248 
lowest common multiple. See LCM 
LTM (long term memory). See also memory 
conjunctive coding for, 84 
in LISA, 86 


MACFAC (“Many are Called but Few Are Chosen”) model
content vectors in, 133 
SME and, 133-134 
magnitudes 
answer, 574 
comparand, 573 
mental, 560, 562 
objective, 560 
operand, 573, 574
rational numbers and, 561 
Weber's law and, 569, 575 
MAM (Metric Array Module) 
LISA and, 136 
“Many are Called but Few Are Chosen.” 
See MACFAC 
mapping 
ACME networks and, 134
in analogy, 117, 124-127, 410, 424, 531 
bidirectional, 579, 583 
bistable, 126, 127 
coherence in, 125-127 
cross-dimensional (for heuristics), 272 
in CWSG, 128-129 
goal-directed, 124-125 
in LISA, 135 
in metaphor, 120 
relational responses in, 127 
in SME, 132 
structure, 121 
WM in, 127-128 
“match” signals 
in thought disorder, 505, 507, 509 
matching bias, 174-176 
Wason selection tasks and, 174-176
mathematical cognition
bidirectional mapping in, 579, 583
children and, 579-582 
FINST system and, 581 
foundations of, 559 
infant number discrimination and, 581 
magic experiments and, 580 
numerical estimation as part of, 562-564 
object tracking systems and, 581 
verbal numerical competence and, 579-582, 583 
mathematical problem solving, 338-342 
domain knowledge in, 338-340
ERPs in, 342 
LCM in, 339 
mathematical models in, 340 
object based inferences in, 340-342
semantic alignment in, 341 
semantic symmetry in, 341 
subgoals generation in, 340 
mathematics 
arithmetic as part of, 560 
formalist perspective of, 559 
mental magnitudes and, 560, 562 
numbers systems and, 560, 562 
objective magnitudes and, 560 
proportion in, 561 
Pythagorean formula, role in, 561 
matrix reasoning, 591, 593 
aging and, 594, 602 
tasks, 591 
McCleskey v. Kemp, 697-699
MDS (multidimensional scaling) models 
algorithms, 16 
applications of, 16 
city-block metrics within, 16 
compressed representations in, 16-17 
contemporary uses of, 16 
The Contrast Model within, 20 
density as part of, 19 
Euclidean metrics within, 16 
Hierarchical Cluster Analysis and, 22 
hierarchical structures in, 22 
input to, 15 
inter-item distances in, 19 
item bias within, 19-20 
output of, 15 
postulated representations in, 22 
propositional structures in, 22 
quantitative representations in, 17 
of similarity, 15 
space dimensionality in, 19 
MDX-2 systems, 737 
means-ends analysis, 328-329
hill climbing vs., 329 
Tower of Hanoi problem and, 328 
mechanics, 387 
medial temporal lobes. See MTL 
medical education 
medical reasoning and, 741-742 
PBL in, 741 
problem solving and, 741 
medical errors
medical reasoning and, 738
mistakes, 739
slips, 739
taxonomy of, 738-739
theory of action and, 739
medical knowledge, 734-735
basic scientific, 734
clinical, 734
compiled causal, 738
medical reasoning, 727
artificial intelligence and, 728
causal reasoning as part of, 736-738
“cover and differentiate” reasoning models as part of, 730
data-driven reasoning as part of, 732
decision research and, 740-741
decision-analytic approach to, 728
“deep” expertise in, 737
directionality in, 727
early history of, 727-729
errors and, 738, 739-740
forward-oriented strategies in, 729, 731-732
GUIDON consultation systems and, 729
heuristic classification in, 730
horizontal organization in, 734
hypothesis testing in, 731
hypothesis-evaluation in, 728
hypothesis-generation in, 728
hypothetico-deductive reasoning within, 728, 729, 731, 742
induction in, 731
information-processing approach to, 728
“intermediate effect” in, 735
Internist consultation systems and, 729
knowledge encapsulation in, 737
MDX-2 systems in, 737
medical education and, 741-742
medical knowledge and, 734-735
mental model progression in, 737
models of, 728
MYCIN consultation systems and, 729
NDM and, 740
NEOMYCIN consultation systems and, 729
nonmonotonicity in, 735
PIP and, 729
QSIM systems, 737
“real-world” clinical tasks and, 727
“select and test” models in, 730
“shallow” expertise in, 736
similarity in, 733-734
technology-mediated, 743
technology’s effects on, 744
“vertical” role in, 734
medical technology
EMR, 744
Euler circles and, 743
as external representations, 743
medical reasoning and, 743-745
memory
concepts and, 39, 40-41, 48-49
distinct primary, 457
domain specificity and, 60-61
dynamic, 402
generic, 62
implicit cognition in, 432
in LISA, 86
LTM storage and, 84
multiple component systems of, 47, 458
organization within, 42
primacy effects for, 458
Quillian-type, 42
recency effects for, 457
secondary components of, 457
semantic, 39-41, 62
separate primary, 457-458
“story,” 122
mental accounting
in decision making, 255-256
topicality in, 256
mental accumulator model
for mental magnitudes, 565
mental logic theories
cognitive neuroscience and, 479
deductive reasoning and, 475-476
inferential roles within, 475
mental model theory vs., 476
mental magnitudes, 560, 562, 582
adult number behavior and, role in, 583
for duration, 565, 567
mapping between, 571
mental accumulator model for, 565
numerosities and, 568
for numerosity, 565
scalar variability and, 564, 571, 573, 582
subjective quantity in, 52
mental model theory (deductive reasoning), 172, 186, 190-191
analogy and, 187
artificial intelligence and, 187
behavior simulation in, 187
categorization and, 185-186
cognitive neuroscience and, 478-479
compound premises in, 196
deduction strategies in, 196
development of, 203
diagrammatic systems within, 186
“double disjunction” in, 193
Euler circles in, 190, 191, 192
fault diagnosis and, 187
history of, 186-187
induction cases as part of, 186
inferences in, 192, 194, 195
isomorphism hypothesis in, 186
in medical reasoning, 737
mental logic theory vs., 476
“minimal completion” hypothesis and, 194
modal reasoning and, 193
Modus Ponens and, 173
‘picture’ theory of meaning and, 186
prediction corroboration for, 193
principle of iconicity in, 187-188
principle of truth in, 190, 194, 195
psycholinguistics and, 187 
semantic information effects in, 185 
semantics and, 172, 187 
suppositional reasoning for, 196 
syllogisms in, 190 
tidal predictors and, 186 
truth-functional meanings in, 188 
updating in, 190 
mental models 
“runnable,” 228, 331 
for visuospatial reasoning, 228 
Mental Models (Gentner/Stevens), 228 
Mental Models (Johnson-Laird), 228 
mental representations, 2 
mental scanning tasks, 213-214 
mental transformations, 213-214 
application of, 214 
dissociability in, 215 
“move” as part of, 214 
“rotate” as part of, 214 
of self, 214-215 
metaphor 
analogy and, 120-121 
mapping in, 120 
metonymy in, 120 
source domain in, 120 
tenor (target) as part of, 120 
time and, 120-121 
metonymy, 120 
metrics 
city-block, 16 
Euclidean, 16 
power, 19 
MindStorms (Papert), 721 
“minimal completion” hypothesis, 194 
mirror image transformations, 26 
mistakes (medical errors), 739 
action specification, 740 
evaluation, 739 
execution, 739, 740 
goal, 739 
heuristics and, 739 
intention, 739 
medical reasoning and, 739-740 
perception, 740 
procedural, 740 
modality specificity
in Embedded-Processes memory model, 463
for WM, 467, 470
Modified Digit Span task. See MODS 
MODS (Modified Digit Span task) 
in ACT-R, 413 
sample trial for, 413 
Modus Ponens, 171-172, 178, 179 
mental model theory and, 173
Modus Tollens, 171, 172, 173, 174-175, 178
material implications in, 172
truth table analysis of, 172
monotonicity
in declarative knowledge, 386
in similarity-based induction, 104 
motion 
abstract paths for, 217 
expression, in language, 645-646 
self-propelled, 217 
motivated thinking 
affective thinking, influence on, 311 
cognitive vs., 296 
history of, 295-296 
“New Look” school of, 296 
outcome, 296-306 
strategy, 306-310 
motivations 
for creativity, 358 
drives and, 295 
expectancies and, 295 
spreading activation and, 295 
MTL (medial temporal lobes) 
schizophrenia, role in, 512, 513-514, 515 
structures, role of, 439 
subliminal perception, role in, 439 
Muller v. Oregon, 693 
multidimensional scaling models. See MDS 
multiple component working memory models, 
458-462 
articulatory loop as part of, 459 
central executive as part of, 458, 460 
Embedded-Processes, 462-463 
episodic buffer in, 458, 461-462 
four components for, 458-459 
phonological loop in, 458 
slave systems for, 459 
task-interference paradigm in, 463-466 
visuospatial sketchpad in, 458, 459-460 
multiple intelligences theory, 762-763 
cognitive modularity as part of, 762-763 
multiple systems 
in categorization (conceptual), 46-47 
for memory, 47 
multiplicative binding schemes, 82 
MYCIN consultation systems, 729 


naive biology domain, 61 
domain specificity and, 59 
naive law-school domain, 61 
naive physics domain, 61 
domain specificity and, 59 
naive theory (inductive reasoning), 108-109 
age-based knowledge systems and, 109 
human bias as part of, 109 
naming effects, 105-106 
in similarity-based induction, 105 
“Natural Kinds” (Quine), 98 
naturalistic decision making. See NDM 
The Nature of Explanation (Craik), 186 
The Nature of the Judicial Process (Cardozo), 
686 
n-back task, 466 
NDM (naturalistic decision making), 740 
double negation effects in, 174-176
explicit negation as part of, 175-177 
implicit negation as part of, 175-177 
RAA reasoning and, 174-176 
négritude 
cultural thought and, 666 
neologism 
thought disorder and, 495 
NEOMYCIN consultation systems, 729 
Neo-Piagetian school (cognitive development) 
information processing theories and, 531 
neural net models, 532-535
backpropagation in, 533 
balance scale and, 532 
balance state in, 534 
cluster analysis for, 536 
in information processing theories, 532 
prototype formation in, 545 
rules in, 534 
symbolic processes in, 535, 536 
systematicity in, 536 
three-layered, 536 
neural network models, 5 
categorization in, 46 
New Directions in Scientific and Technical Thinking 
(Gorman), 708 
9-dot problem, 230, 331
creativity and, 356 
as insight problem, 331 
solution for, 331 
“No Peeking Principle,” 50 
concepts and, 50 
“noble lie” (legal reasoning), 695 
non-directional outcome-motivation, 296, 300-301 
accuracy in, 301 
analytic complexity in, effects of, 302 
attribution, effects on, 301 
closure in, 301 
cognition and, need for, 312 
evidence evaluation, effects on, 301 
fear of invalidity in, 301 
knowledge activation, effects on, 303
recall, effects on, 302-303 
non-human primates 
analogical reasoning in, 611-613 
causal reasoning for, 619-625 
Cognitive Revolution, role in, 593 
conjunctive negation in, 615-616 
conservation experiments for, 618-619 
counting ability in, 616-617 
detour use by, 608 
DMTS tasks and, 612 
identity tasks and, 610 
inferential reasoning in, 613-616 
least-distance strategies in, 609 
moving object search by, 609 
object permanence tasks by, 609 
oddity conceptualization in, 610 
ordinality for, 614-615 
perceptual strategy use by, 619
quantitative reasoning in, 616-619
relational reasoning in, 609-611 
rotational displacement and, 609 
sameness-difference comprehension in, 610-611 
shortcut use by, 608-609 
social reasoning for, 625-626 
spatial reasoning in, 608-609, 612 
stick and hook tasks for, 621 
summation tasks for, 617-618 
support problems and, 620-621 
“theory of mind” and, 626 
tool use by, 619-620 
transitivity in, 613-614
tube and trap problems and, 621-623 
nonmonotonicity 
bootstrapping and, 389-390 
feature coverage model analysis for, 104-105 
intuitive theory and, 387 
in medical reasoning, 735
ontological shifts as result of, 392
replacement as part of, 390-391 
in similarity-based induction, 104, 110 
transfer via analogy in, 391-392 
normal speech production models, 498, 507 
diagram, 496, 498 
lemma retrieval in, 496-497 
on neural levels, 498 
“rhetorical/semantic/syntactic” system within, 
496 
thought disorder and, 496-498 
normative system problem (for biases), 173-175 
notion of overhypotheses, 106 
Novum Organum (Bacon), 706 
number-left procedures, 567 
flash generation in, 567 
variable ratio schedules for, 567 
numbers systems (mathematics), 560, 562 
bit pattern symbols in, 564 
bit patterns within, 
defining features of, 566 
fractions, 560 
irrational, 561 
measuring quantities in, 561 
negative, 561 
notation system for, 74, 562 
rational, 560 
real, 562 
types of, 560-562 
numerical estimation (animals), 563, 564 
key-tapping paradigms in, 573, 574 
in mathematical cognition, 562-564 
multiplying rates in, 568 
nonverbal arithmetic reasoning and, 572-574 
nonverbal counting, 571-572 
number-left procedures in, 567 
numerosity in, 564, 566 
rate of reward in, 568 
reaction time/accuracy in, 570 
subtracting durations for, 566-567 
time-left procedures in, 566-567 
numerosities
adding, 566
difference judging experiments with, 576
discrimination experiments with, 576
division of, 567-568, 591-593
dot, 577
dot-arrays and, 572
estimating mechanism for, 577
language and, 649-651
map, 552
mental magnitudes and, 568, 569
nonverbal estimated, 572
in numerical estimation (animals), 564
objective, 576
ordering, 568-569
rate of reward and, 568
symbolic distance/size effect and, 547, 569-571
tone sequences for, 572


Odyssey program (thinking), 779, 781
“On Problem Solving” (monograph) (Duncker), 324
On Scientific Thinking (Doherty/Mynatt/Tweney), 708, 710
“one-back” continuous performance task, 501
ontology
language’s influence on, 643
organizational models
horizontal, 734
of memory, 39-40
outcome-motivated thinking, 296-306
directional, 296, 297
limits to, 303-305
non-directional, 296, 300-301
overshadowing, 148


patterns 
in connectionist representations, 78 
explanation, 375 
matching (production rules), 402 
transmission (schizophrenia), 511 
PBL (problem-based learning), 741 
backward-directed reasoning modes and, 
742 
reasoning errors as result of, 742 
Peak/End Rule, 254 
application of, 284 
People Pieces analogy task, 465 
perception 
in causal learning, 145, 162 
constraints in, 15 
constructive, 231 
functions in, 15 
language’s effect on, 638 
mistakes (medical errors), 740 
subliminal, 437-440 
perceptual knowledge 
conceptual vs., 107 
in similarity-based induction, 107 
perceptual representation systems, 443-445 
context effects in, 444 
task demands in, 444


PET (positron emission tomography)
intelligence and, 759-760 
in scientific reasoning, 716 

phase shift transformations, 27 
Philosophy for Children program (thinking), 778, 779, 787
PMI operation as result of, 778 
phonemes, 638 
phonological loop 
articulatory suppression in, 459 
in multiple working memory models, 458 
phonological store within, 459 
role of, 459 
‘picture’ theory of meaning, 186 
PIP (Present Illness Program), 729 
plans (declarative knowledge), 375 
polysemy, 53-54 
linguistic differentiation within, 54 
typicality structures in, 53 
Popular Science, 706 
positive self-evaluation
within directional-outcome motivation, 298, 300
individualistic cultures and, 312 
strategies for, 300 

positron emission tomography. See PET

Posner’s trains-and-bird problem, 322, 325 
alternative representations for, 322 

power metrics, 19 

Power PC theory 
Boyle’s law and, 153 
in causal learning, 153, 154, 155, 160
ceiling effects and, 155 
deviations from, 158, 159 
flexibility of, 160 

PPP (Premise Probability Principle) 
in scientific methodology, 108 

pragmatic reasoning schemas, 173 
in Deontic Selection Task, 177 

predicate calculus (logic system), 186, 187

predicates 
ACME matching, 134 
in category structure, 100 
inductive reasoning, role in, 99 
in similarity-based induction, 103 

prediction 
in causal learning, 144, 151, 152 
for concepts, 38 

predictive learning 
causal learning vs., 147 

preferences 
in strategy-motivated thinking, 309-310 

Premise Probability Principle. See PPP 

Present Illness Program. See PIP 

prevention focus 
promotion focus vs., 307 
in regulatory focus theory, 306 
subtractive counterfactuals and, 308 

primacy effects, 458
primary memory
distinct, 457 
“focus of attention” vs., 462 
separate, 457-458 
priming 
in frame of mind, 257 
momentary, 257 
principle of equiprobability, 197 
sentential reasoning models in, 197 
principle of iconicity (mental model theory) 
exclusive disjunction table for, 188 
formal logic in, 188 
fully explicit possibility models in, 188, 189 
in mental model theory, 187-188 
sentential reasoning in, 188, 190 
principle of “indifference” 
in probabilistic reasoning, 197 
principle of least commitment 
in similarity, 15 
principle of modulation, 199 
principle of strategic variations 
in mental model theory, 191-192 
principle of truth (mental model theory), 190, 194, 195 
in deductive reasoning, 172 
mental footnotes as part of, 190 
probabilistic reasoning, 197 
extensional reasoning and, 197, 198 
intensional reasoning and, 197 
principle of equiprobability, 197 
principle of “indifference” in, 197 
subset principle in, 198 
probability 
conditional, 101 
extensional representations of, 102 
in prospect theory, 245 
problem representations, 330, 335 
expertise and, 336 
four components of, 330-331 
functional fixedness in, 332 
runnable mental models as, 331 
problem schemas 
in analogy, 130 
problem solving 
algorithmic solution strategies in, 325 
analogous experience and, role in, 335 
analogy and, 122 
change variant in, 333 
complex learning and, 6 
context in, 331-332 
convergence solution in, 334 
expertise, role in, 336-338 
general memory schemas and, 335-336 
Gestalt psychology and, 324 
goal state in, 322 
GPS in, 324, 329 
heuristic solution strategies in, 326 
hill climbing, 327-328 
history of, 324-325 
Hobbits and Orcs problem and, 327, 328 
ill-defined problems in, 330 
initial state in, 322

mathematical, 338-342

means-ends analysis in, 328-329

medical education and, 741 

in medical reasoning, 728 

object-based inferences in, 332-333

paired, 780 

Posner’s trains-and-bird problem and, 322 

problem spaces in, 326-327 

production rules in, 424 

progress monitoring theory in, 344 

reasoning and, 2 

representational change theory and, 344

representations as part of, 330-331, 335 

scientific thinking as, 708-709 

solution comprehension in, 338 

solver knowledge, role in, 333-334 

story content in, 332-333 

subgoals in, 329 

Tower of Hanoi problem and, 322, 325, 329 

transfer variant in, 333 
Problem Solving and Comprehension (Whimbey), 779
problem spaces 

computer simulations for, 327 

current knowledge state in, 326 

in problem solving, 326-327 

in scientific thinking, 708, 709 

searches for, 327

think-aloud protocols, 327 

in Tower of Hanoi problem, 326 
problem-based learning. See PBL 
procedural knowledge, 371-372 

acquisition of, 372 
process-dissociation procedures, 438 
production rules, 401 

abstract, 403 

in ACT-R, 405 

asymmetry in, 403 

in EPIC, 412 

illustrative examples of, 401, 402, 403 

modularity in, 402-403 

in problem solving, 424 

in Soar, 405 

verbalization of, 403 
production systems (thinking), 4 

abstract, 403 

action execution in, 402 

ACT-R, 404 

AMBR and, 425 

analogy and, 402, 409-412 

asymmetry in, 403 

background on, 401-404 

categorization, 415-417 

choice in, 406-408 

conflict resolution in, 402 

conflict sets in, 402 

dynamic memory in, 402 

EPIC, 404 

4-CAPS, 404 

future applications for, 425-426 

hybrid view of, 404 

knowledge content in, 403 
modularity in, 402-403
noise processes, role in, 404
pattern matching in, 402
problem-solving choice in, 408-409
rules within, 401, 402
skill learning, 418-420, 421
Soar, 404
verbalization in, 403
WM, 412-414
Productive Thinking (Wertheimer), 324, 706
progress monitoring theory, 344
representational change theory vs., 344
Progressive Education Association, 720
projectibility
in inductive reasoning, 97
problems with, 97
promotion focus
additive counterfactuals and, 308
prevention focus vs., 307
in regulatory focus theory, 306
propositional notation, 74
prospect theory (decision making), 244-246
concave utility functions in, 245
insurance and, 245
loss aversion in, 245
probabilities in, 245
risk aversion in, 244
risk seeking in, 245
value function of, 245
prototype heuristics, 281-287
base rate neglect in, 282
duration neglect and, 282
extensional attributes and, 282
scope neglect in, 282
prototype models (categorization), 44-45
differential forgetting in, 45
distortion within, 44, 45
psychodynamic psychology
creativity and, approaches to, 353-354
methodology of, 354
psychological essentialism, 56-57
in cultural thought, 674
evidence for, 56
family-resemblance categories within, 57
minimalist alternativism in, 57
restrictions within, 58
“psycho-logics,” 529
psychology
clinical, 4
Gestalt, 3
similarity assessments within, 13
of thinking, 3
psychometaphysics, 38, 58, 63
concepts and, 55
psychometric tradition (analogy), 118-120
computational models for, 119
in creativity, 354-356
crystallized intelligence in, 118
fluid intelligence in, 118
4-term analogies in, 118, 119
neurophysical studies for, 119
RPM tests in, 118, 119
Pythagorean formula, 561


QSIM systems, 737
Q-SOAR model
chunking in, 532
in conservation, 544
information processing theories and, 532
Quillian-type memory organization, 42


RAA (Reductio ad Absurdum), 172
in negative conclusion bias, 174-176
random number generation
syllogisms, effect on, 464
in task-interference paradigm,
RAT (Remote Associates Test), 355
rational numbers (mathematical cognition), 560
magnitude measurement and, 561
rational theory of choice, 243, 244
preferences in, 244
rationality concepts
coherence, 277
Raven’s Progressive Matrices. See RPM
reasoning
abductive, 730 

aging, effects on, 567-568, 590, 591-593 
analogical (non-human primates), 611-613 
analytical tasks for, 591 

in categorization, 45 

causal (non-human primates), 619-625 
cognitive development and, 541-543
collective cultural activities and, 626 
decision making and, 2, 251-252 
deductive, 2, 169 

emotions and, 312 

functional anatomy of, 479-481 

given information in, 209-210 
horizontal, 734

hypothetico-deductive, 728, 729, 742 
inductive, 2, 95, 96 

inferential (non-human primates), 613-616 
integrative, 591, 593, 594

intensional, 197 

legal, 685 

matrix tasks for, 591, 593 

medical, 727 

probabilistic, 197 

problem solving within, 2 

with quantifiers, 197 


quantitative (non-human primates), 616-619 


reflective, 96 

relational, 197, 609-611 

representations in, 210 

scientific, 696-700 

sentential, 188 

series completion tasks for, 591 

Shipley Abstraction series test for, 592 

social (non-human primates), 625-626 
spatial (non-human primates), 608-609, 612 
specific-to-specific, 733
suppositional, 171 
syntactic approach to, 171 
thinking and, 2 
transitive, 454 
varieties of, 6 
visuospatial, 209, 210 
WAIS III, 590
reasoning rationality, 277 
“four card” problem and, 277 
reasoning tasks 
for AG, 434 
analytical, 591 
for concept of mind, 547 
divergent thinking, 354 
DMTS, 612 
DMTS (non-human primates), 612 
false-belief, 547, 673 
identity (non-human primates), 610 
matrix, 591, 593 
mental scanning, 213-214 
object permanence (non-human primates), 609 
series completion, 591, 594 
stick and hook (non-human primates), 621 
summation (non-human primates), 617-618 
visuospatial, 219, 227 
Wason selection, 174-176 
recency effects, 457 
Reciprocal Teaching (Brown/Palincsar), 779, 
781 
Reciprocal Teaching program, 780 
Reductio ad Absurdum. See RAA 
regulatory focus theory 
counterfactual thinking in, 308 
prevention focus as part of, 306 
promotion focus as part of, 306 
in strategy-motivated thinking, 306 
relational complexity metric, 539, 548 
CCC theory and, 539 
conceptual chunking within, 539 
segmentation in, 539 
relational reasoning, 73-76 
arguments within, 87-88 
childhood development and, 73 
flexibility within, 75 
generalization within, 77 
manipulation of, 73 
semantics within, 75-76 
symbolic representations within, 74-75 
systematicity in, 74 
release-from-overshadowing condition, 
155-156 
religion 
cultural thought and, factor in, 670 
Remote Associates Test. See RAT 
replacement systems 
bottom-up, 390-391 
evolution theory and, 391 
in nonmonotonicity, 390-391 
top-down, 391
representations
compressed, 16-17 
computer languages (programming), 74 
conjunctive features and, 23 
distortions program and, 221-222 
Gestalt principles of perceptual organization, 210 
hierarchical, 22 
internal, 2 
labeled graphs, 74 
mathematical notation, 74 
mental, 2 
perceptual, 443-445 
postulated, 22 
problem, 330, 335 
propositional notation, 74 
quantitative, 17 
relational, 74-75 
simple features and, 23 
transformations, effect on, 210 
visuospatial, 210-211 
representational change theory, 344 
chunk decomposition in, 344 
constraint relaxation in, 344 
progress monitoring theory vs., 344 
representativeness (in heuristics), 274-276 
computations for, 288 
conjunction items in, 276 
controversy over, 276-281 
elicited, 274, 275, 276 
group participants and, 274 
reversal transformations, 27 
risk 
aversion, 244 
decision framing, 246-247 
seeking, 245 
“The Road Not Taken” (Frost), 120 
role-filler binding 
LISA and, 86 
by synchrony of firing, 84 
tensor products and, 82, 83 
by vector addition, 83-87 
roles (analogy), 410 
Romeo and Juliet (Shakespeare), 123 
route perspective, 220 
in visuospatial reasoning, 220 
RPM (Raven’s Progressive Matrices) task
WM and, 467, 468 
performance graphics for, 119 
for psychometric tradition (analogy), 118, 119 
rules-plus-exception model. See RULEX 
RULEX (rules-plus-exception) model, 416 
in ACT-R production system, 416 


SAA (symbol-argument-argument) notation 
argument binding in, 76 
connectionist criticism of, 78 
connectionist representations vs., 75, 86 
external structure use in, 78 
ill-typed, 78 
implicit role information in, 77 
isomorphism in, 76, 82 
schemes in, 76
semantics in, 77 
SME and, 88 
as symbolic relational representation, 76-78 
tensor products and, 82, 83 
SAT (Scholastic Aptitude Test), 355 
scalar variability 
mental magnitudes and, 564, 571, 573, 
582 
patterns in, 572 
Weber's law vs., 582 
schemas 
convergence, 130 
in declarative knowledge, 374-375 
permission, 177, 485 
pragmatic reasoning, 173 
problem, 130 
spatial diagram, 335 
transfer processes for, 391 
schizophrenia, 494 
in Cohen & Braver’s model (thought disorder), 
500-503 
contextual bias in, 510 
dopamine receptor binding and, 512 
fetal hypoxia and, 514 
genetic epidemiology of, 511 
genetic factors for, 513 
hippocampus abnormalities and, 513, 514 
interpersonal deficits in, 516 
long-term memory and, effects on, 512 
MTL structures and, 512, 513-514, 515 
neural system abnormalities in, 511-512 
prefrontal cortex and, 511, 512-513 
thought disorder and, 494, 495 
transmission patterns in, 511 
verbal communication and, 516 
verbal declarative memory deficits and, 514 
WM and, effects on, 512 
WMS and, 513 
Scholastic Aptitude Test. See SAT 
science education 
history of, 719-720 
inquiry-based approach to, 720 
scientific reasoning and, 719-721 
scientific methodology (inductive reasoning), 102, 
107-111 
Bayesian models as part of, 110-111 
computational models, 109-111 
diversity principle as part of, 108 
hypothesis evaluation in, 110 
induction rules within, 107-108 
naive theory as part of, 108-109 
PPP in, 108 
prior beliefs as part of, 110 
variability/centrality as part of, 107-108 
scientific reasoning 
analogy use in, 713-714 
brain analysis and, 716-719 
causal thinking as part of, 710-712 
computational models for, 719 


“confirmation bias” in, 709 
deductive reasoning as part of, 712-713 
empirical testing for, 696-697 
ERP use in, 716 
experiment spaces as part of, 708 
fMRI use and, 716 
Gestalt psychology and, 706 
hemispheric differences in, 718 
history of, 705-708 
hypothesis spaces as part of, 708 
hypothesis testing as part of, 707, 709-710 
“Impetus” theory in, 715 
inductive reasoning as part of, 712 
legal reasoning vs., 696-700 
Novum Organum (Bacon) and, 706 
PET use for, 716 
as problem solving, 708-709 
science education and, 719-721 
statistical data’s role in, 710 
unexpected findings and, 710, 711 
scope neglect (prototype heuristics), 282 
Bayes rule application in, 283 
CVM, 282 
in WTP, 282-283 
scripts (declarative knowledge) 
in “theory of mind,” 671 
“select and test” models (medical reasoning), 
730 
four inference types in, 730 
semantic alignment 
isomorphic problems and, 341 
in mathematical problem solving, 341 
semantic framing, 249. See also attribute framing 
semantic memory, 39-41, 62. See also memory 
domain specificity and, 60 
fragmentation of, 41-44 
information overlap in, 42 
lexical decision priming, 42 
natural concepts and, 40-41 
organizational models, 39-40 
“semantic memory marriage,” 39 
semantics, 42-44 
comprehension within, 52 
in connectionist representations, 75 
in declarative knowledge, 373 
fuzzy set theory within, 43 
latent analysis of, 52 
linguistic sentence meaning in, 43 
mental model theory and, 185, 187 
phrase interpretation in, 52 
in relational representations, 75-76 
in SAA, 77 
tensor products and, 82 
sentential reasoning 
fully explicit models for, 188, 189, 190, 
199 
in principle of iconicity, 188 
series completion tasks, 591 
aging and, 594 
Shipley Abstraction series test, 592
short-term memory
capacity of, 458 
SIAM model (alignment-based), 24, 25 
similarity 
alignment-based models of, 24-26 
assessments of, 13, 30-31 
asymmetry in, 18, 20 
automatic assessments of, 30 
in cognition, 14 
comparative analysis within, 27 
contrast models of, 99 
diagnosticity effects in, 29 
dissociations within, 28 
as domain-general information source, 
14 
in entrenchment theory, 98 
expertise and, 29 
extension effects in, 29 
featural models of, 17-24 
feature learning and, 48 
flexibility of, 15, 29-30 
“generic” assessments of, 30 
geometric models of, 15-17 
inclusion, 105 
induction, 102-107 
inference from, 13 
judgments in, 29 
limitations of, 14 
mandatory considerations of, 29 
MDS models of, 15 
in medical reasoning, 733-734 
mental entity structures within, 15 
perceptual constraints and, 15 
practical applications of, 15 
principle of least commitment in, 15 
psychological assessments of, 13 
reasonable expectation and, 13 
Standard Geometric Models of, 18 
transformational models of, 26-28 
similarity-based induction, 102-107 
age-based sensitivity in, 103 
asymmetry in, 103 
basic level bias in, 107 
category coverage as part of, 103 
diversity as part of, 103 
feature exclusion in, 104 
folk-generic levels in, 106 
inclusion fallacy and, 105 
inclusion similarity in, 105 
inference in, 106 
monotonicity/nonmonotonicity in, 
104 
naming effects as part of, 105 
notion of overhypotheses, 106 
predicates as part of, 103 
preferred levels in, 106 
similarity in, 102 
typicality in, 102-103 
sketches 
constructive perception and, 231 
inferences from, 231
skill learning (production system), 418-420, 421
ACT-R and, 419 
composition and, 419 
fact representation in, 419, 420 
proceduralization and, 419 
Soar and, 418 
slave systems 
in multiple working memory models, 459 
slips (medical errors), 739 
evaluation, 739 
execution, 739 
SME (Structure Mapping Engine) 
algorithm in, 132-133 
as analogy model, 132-134 
CWSG as part of, 132 
“deep” mapping in, 132 
“kernels” as part of, 132 
“local-to-global” direction in, 132 
MACFAC and, 133-134 
predicate-calculus notation as part of, 132 
in SAA notation, 88 
Soar (production system), 404 
chunking in, 419 
EPIC combination with, 425 
limitations for, 412 
production-rule learning in, 405 
serial processing in, 405 
skill learning and, 418 
WM in, 413 
social intelligence, 752 
“Sociological Jurisprudence,” 690 
Socratic dialogue, 775 
SOI (structure-of-intellect) model (intelligence), 754, 
782 
sortalism, 57-58 
developmental psychology, role in, 57 
identity-lending categories within, 57 
source analogs, 117, 122, 714 
spatial diagram schemas, 335 
generality level of, 335 
matrices, 335 
spatial framework tasks 
theory for, 219 
in visuospatial reasoning, 218 
spatial tapping, 465 
spatter codes, 81, 83
speech disorders 
anomic aphasia, 499 
programming deficit, 499 
thought disorder vs., 495-499 
spreading activation 
attributes and, 249 
in implicit cognition, 445 
motivations and, 295 
standard economic model (decision making). See 
rational theory of choice 
Stanford-Binet test, 753 
status quo bias, 248-249 
Story Gestalt model (story comprehension), 79 
testing for, 79 
“story memory,” 122 
strategy-motivated thinking
accuracy vs. speed in, 308-309
alternative hypothesis and, effects on, 306-307
counterfactual thinking, effects on, 307-308
eager vs. vigilant, 307, 309 
future applications for, 310-311 
preferences in, 309-310 
preferred, 306 
recall and, effects on, 309 
regulatory fit in, 309 
regulatory focus theory in, 306 
Stroop task, 272 
schizophrenia and, 501 
Structure Mapping Engine. See SME
The Structure of Scientific Revolutions (Kuhn), 
380 
subliminal perception, 437-440 
abstract representation in, 438 
brain regions for, 437 
emotional preferences, effects on, 437 
“mere exposure” effect in, 438 
MTL and, 439 
process-dissociation procedures in, 438 
semantic priming as result of, 437 
subset principle, 198 
subsymbolism 
in ACT-R, 406, 408, 422 
constituents within, 406 
proportion exemplar model for, 417, 418 
“superposition catastrophe,” 83 
effects of, 83 
Supervisory Attentional System, 460-461 
capacity limits for, 461 
Contention Scheduler as part of, 460 
schemata in, 460 
suppositional reasoning, 171 
Conditional Proof, 171 
for mental model theory, 196 
Modus Tollens, 171 
RAA and, 172 
Sure Thing Principle, 251 
survey perspective, 220 
in visuospatial reasoning, 220 
syllogisms 
in deductive reasoning, 169, 713 
figures as part of, 170 
four statements of, 170-171 
intelligence and, 757 
in mental model theory, 190 
model adjustment for, 191 
moods as part of, 170 
random number generation’s effect on, 464 
in task-interference paradigm, 464 
visuospatial sketchpad and, 464 
symbol-argument-argument notation. See SAA 
symbolic distance effect, 22 
symbolic distance/size effect, 547, 569-571
symbolic relational representations, 74-75. See also relational reasoning
conjunctive connectionist, 81-83
connectionist (traditional), 78-81
generalization within, 77
SAA, 76-78
symmetry 
in distortions program, 222 
in mathematical problem solving, 341 
in Standard Geometric Models (for similarity), 18 
synchrony of firing, 84 
in frontal cortex, 84 
limitations of, 84 
in primate visual cortex, 84 
synectics, 353 
System 1 (dual process theory), 180, 267 
impression of distance in, 269 
judgment problems and, 267 
System 2 (dual process theory), 180, 267 
additive extension effect in, 280, 282 
attention manipulation in, 279-280 
attribute substitution in, 273 
frequency format in, 279 
intelligence and, 278-279 
intuitive judgments within, 273 
judgments of distance in, 269 
proposal quality and, 267 
statistical sophistication, effect on, 278 
Stroop task for, 272 
time pressure effects on, 273 
within-subjects factorial designs in, 280-281 
systematic transformations, 2 
systematicity 
in neural net models, 536
in relational thinking, 74 
three-layered nets and, 536 


target analogs, 117, 714 
stories as part of, 123-124 
task-interference paradigm, 463-466 
analogy and, 465 
articulatory suppression effects within, 465 
memory loads for, 463 
random number generation in, 
spatial tapping, effects of, 465 
verbal syllogisms in, 464 
temporal contiguity 
in associative learning theory, 147 
in causal learning, 147 
temporal discounting 
in decision making, 256-257 
excessive, 257 
tensor products, 81 
definition of, 81-83 
limits of, 82 
multiplicative binding schemes and, 82 
relational binding representation and, 82 
relational generalization support from, 82 
role-filler binding, 82, 83 
SAA-isomorphic, 82, 83 
semantic relation content and, 82 
Terman Concept Mastery Test, 354 
theory of “g” (intelligence), 754, 782 
correlate eduction in, 754
experience apprehension in, 754
factor analysis as part of, 754 
mental energy in, 754 
relation eduction in, 754 
“theory of mind” 
concept of mind vs., 547 
cultural thought and, 670-674 
in non-human primates, 626 
script knowledge in, 671 
Think (Adams), 779 
thinking, 2. See also reasoning 
affective, 311 
beliefs and, 1 
cognitive stages for, 529-530 
computer simulations for, 4 
counterfactual, 298, 307-308 
cultures of, 796 
definition of, 2, 780-781 
development of, 529 
dispositional of, 777, 785 
early influences on, 529-531 
educational systems and, 776
foresight and, 1 
Gestalt school and, 529 
groupthink and, 776 
hierarchy of kinds in, 2 
high-end, 776 
instruction, 777 
internal representations from, 2 
judgment and, 1 
lateral, 352 
mental representations and, 2 
modern conceptions of, 4 
motivated, 295-296 
norms in, 781, 782
in practice, 7 
production systems and, 4 
“psycho-logics” and, 529 
psychology of, 3 
reasoning and, 2 
relational, 73-76 
results attainment for, 777-780
self-regulation in, 529 
Socratic dialogue and, 775 
speech and, 641-642 
structure in, 529 
systematic transformations and, 2 
theory of action in, 781 
transfer processes in, 777 
“thinking for speaking,” 653, 654 
“thinking hats,” 353 
“thinking” instruction, 777 
CoRT program for, 778, 779, 781 
DAPAR test for, 778 
GE programs for, 775 
IE programs for, 778 
Odyssey program for, 779, 781 
paired problem solving in, 780
Philosophy for Children program, 778, 779
Problem Solving and Comprehension (Whimbey), 779
Reciprocal Teaching (Brown/Palincsar), 779, 781
Reciprocal Teaching program for, 780
Think (Adams), 779 
Think and Intuitive Math programs, 779, 780 
transfer effects in, 780 
The Thinking Classroom (Jay/Perkins/Tishman), 787 
thought disorder, 494 
capacity allocation in, 508-509 
cognitive deficit integration and, 509-510 
Cohen & Braver’s model, 500-503 
context attention in, 507-508 
“controlled attention” in, 516-517 
definition of, 495-496 
“derailment” in, 496, 509 
endophenotype approach to, 510-511, 514-515 
environmental factors for, 511 
executive system functioning in, 514 
form vs. content in, 516 
Hemsley & Gray’s model, 503-506
heritability of, 511
hyper-priming hypothesis and, 507 
hypothesized deficits and, 506 
ideational confusion definition in, 496 
“match” signals in, 505, 507, 509 
memory retrieval interference as result of, 508 
negative, 509 
neologism production as result of, 495 
normal bias yielding, 508 
normal speech production models and, 496-498 
pathology of, 494 
phonological/phonetic system and, 506 
positive vs. negative, 496 
schizophrenia and, 494, 495 
self-monitoring within, 509 
semantic retrieval in, 507 
social cognitive neuroscience applications for, 516 
speech disorder vs., 498-499 
symptom taxonomy for, 495 
trait deficits of, 510-511 
threshold theory, 355 
time 
in metaphor, 120-121 
time-left procedures, 566-567 
Torrance Tests of Creative Thinking, 354 
Tower of Hanoi problem, 322, 325, 329, 333 
means-ends analysis and, 328
possible solutions for, 322, 325-326 
problem space in, 326 
Transfer Appropriate Processing, 789 
transfer of learning processes, 788-792 
expertise in, 790-791 
far, 790 
history of, 788-789 
interventions as part of, 789 
transfer protocols 
abstract analogy in, 444 
chunk strength in, 444 
in implicit cognition, 443 
transformational models (of similarity), 26-27, 28 
generative representation systems in, 26 
global consistent correspondences in, 27 
Kolmogorov complexity theory within, 27 
memorized [entry illegible]
mirror image, 26 
phase shift, 27 
reversal, 27 
rigid, 26 
stimuli pairs within, 26 
wave length, 27 
transformations 
elementary, 215-216 
mental (objects), 213-214 
in visuospatial reasoning, 212-215 
transitive inference 
associative, 613 
in non-human primates, 613-614 
transitive reasoning 
dissociation and, 484 
triangle inequality assumption, 18 
violations in, 18 
Triarchic theory for Successful Intelligence, 762, 
763-764 
experience’s role in, 763 
external world relations in, 763 
internal world relations in, 763 
“true” intelligence, 764 
experience as aspect of, 764 
neural intelligence as aspect of, 764 
reflective aspect of, 764 
truth tables 
fully explicit models vs., 190 
Modus Tollens, analysis of, 172 
typicality effects, 44
in similarity-based induction, 102-103 


unconscious thought. See implicit cognition 
Unusual Uses Test, 354 


variability 
in scientific methodology, 107-108 
sensitivity to, 108 
Variations in Value Orientation (Strodtbeck), 663, 664 
veridicality 
consistency vs., 381 
Vision (Marr), 5 
visuospatial reasoning, 209, 210 
abstract concepts in, 22 
analogs in, 225 
brain pathways and, 216 
diagrams and, 219, 227 
distortions program and, 221 
framework tasks in, 218 
graphics context and, 22 
individual differences in, 216 
judgments and, 221-224 
language effects on, 220-221 
mental models for, 228 
mental rotation tasks for, 212, 216 
mental scanning tasks for, 213-214 
narratives in, 219 
perception vs. imagery in, 211-216, 215 
reference systems in, 221 
route/survey perspectives as part of, 220
“second-order isomorphisms” and, 211
space and, spontaneous use of, 226-227
spatial perception in, 216 
spatial visualization in, 216 
symbolic distance effect in, 22 
transformations in, 212-216 
updating in, 218 
visuospatial representations, 210-211 
object properties as part of, 210, 211 
visuospatial sketchpad 
memory tasks for, 461 
in multiple working memory models, 458, 459-460 
syllogisms and, 464 
vocabulary 
closed-class functional, 641 


WAIS (Wechsler Adult Intelligence Scale) III, 590
Wason selection tasks 
cognitive development and, 543
content effects within, 543
Deontic Selection Task and, 174-176, 178, 180 
matching bias and, 174-176 
wave length transformations, 27 
WCST (Wisconsin Card Sorting Test) 
aging and, 594, 595, 670
feedback-based concept identification in, 595 
parameters for, 595 
Weber's law 
characteristic for, 578 
numerical magnitudes and, 569, 575, 576, 583
scalar variability vs., 582 
Wechsler Adult Intelligence Scale. See WAIS
Wechsler Intelligence Scale for Children. See WISC
Wechsler Memory Scale. See WMS
Wechsler scale (intelligence testing), 753
West Side Story, 123 
Williams v. Florida, 697 
willingness to pay. See WTP 
WISC (Wechsler Intelligence Scale for Children), 355
Wisconsin Card Sorting Test. See WCST 
Witherspoon v. Illinois, 697 
WM (working memory). See also memory 
in ACT-R, 412, 413-414, 416, 470 
aging and, 595-597 
in analogical mapping, 127-128 
analogical reasoning and, 120 
animal behavior and, 470 
attentional control function in, 467 
capacity limits for, 466
in CAPS, 412 
“Confirmation Bias” limitations in, 710 
definition of, 457, 466 
EBRW and, 416 
in EPIC, 413 
fMRI and, 468 
high fidelity modeling in, 414 
individual differences in, 466-467 
integrative reasoning and, 596 
intelligence and, 757 
in LISA, 86, 135, 413, 470 
mediational analysis of, 598 
modality specificity for, 467, 470
n-back task and, 466
neural basis of, 120
prefrontal cortex functioning and, 466
in production systems, 412-414
relational complexity in, 468
RPM task for, 467, 468
schizophrenia and, effects on, 512
in Soar, 413
task as part of, 412
variability in, 424
WMS (Wechsler Memory Scale)
schizophrenia and, 513
working memory. See WM
WTP (willingness to pay)
additive extension effect and, 283
“add-up” rule in, 283
categorical prediction similarity and, 283
in scope neglect, 282-283


zone of proximal development
in cognitive development, 531


